COUNT(id) or MAX(id) - which is faster?
I have a web server with my own messaging system implemented.
I am at the phase where I need to create an API that checks whether the user has new message(s).
My DB table is simple:
ID - Auto Increment, Primary Key (Bigint)
Sender - Varchar (32) // Foreign key to the UserID hash from the Users DB table
Recipient - Varchar (32) // Foreign key to the UserID hash from the Users DB table
Message - Varchar (256) // UTF8 BIN
I am considering an API that determines whether there are new messages for a given user. I am thinking of using one of these methods:
A) Select count(*) of messages where the sender or recipient is me (if this number > the previous number, I have a new message)
B) Select max(ID) of messages where the sender or recipient is me (if max(ID) > the previous number, I have a new message)
My question is: can I somehow calculate which method will consume fewer server resources? Or is there an article about this? Maybe there is another method I have not mentioned?
php mysql performance
asked 2 hours ago by FeHora, edited 1 hour ago by Kaii
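For concreteness, here is a minimal sketch of the two candidate checks, A and B. It rests on assumptions that are not in the question: the table is called `messages`, the user hashes are plain strings, and Python's built-in sqlite3 module stands in for MySQL so that the snippet runs on its own. It only shows what each check returns and says nothing about how MySQL would execute it.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table name "messages" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE messages (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           sender TEXT NOT NULL,
           recipient TEXT NOT NULL,
           message TEXT NOT NULL
       )"""
)


def count_for(conn, user):
    """Method A: number of messages the user sent or received."""
    row = conn.execute(
        "SELECT COUNT(*) FROM messages WHERE sender = ? OR recipient = ?",
        (user, user),
    ).fetchone()
    return row[0]


def max_id_for(conn, user):
    """Method B: highest message id the user sent or received (0 if none)."""
    row = conn.execute(
        "SELECT MAX(id) FROM messages WHERE sender = ? OR recipient = ?",
        (user, user),
    ).fetchone()
    # MAX() over zero rows yields NULL, which arrives here as None.
    return row[0] or 0


def send(conn, sender, recipient, text):
    conn.execute(
        "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
        (sender, recipient, text),
    )


send(conn, "alice", "bob", "hi")
prev_count, prev_max = count_for(conn, "bob"), max_id_for(conn, "bob")

send(conn, "carol", "bob", "hello")
new_by_count = count_for(conn, "bob") > prev_count
new_by_max = max_id_for(conn, "bob") > prev_max
print(new_by_count, new_by_max)
```

Both checks report a new message here. They differ in how much work the database does to answer and in how they behave when rows disappear.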
I think you would be better off adding a timestamp column and checking against that value to see if there are newer records.
– Dharman
2 hours ago
Whether you query a timestamp or the ID, use MAX() on that column, and make sure it's indexed with (user_id, timestamp).
– The Impaler
1 hour ago
@Dharman I was thinking of that, but it costs extra DB space, and I am not sure it would be faster than either of my methods. I am storing a simple number (the current message count) in the usernames table.
– FeHora
1 hour ago
Calculate? No idea. But you can measure it. Fire off a few thousand of each query and watch machine metrics (cpu%, mem%, load average, etc.)
– Sergio Tulentsev
1 hour ago
While there is a good answer to this question below, I suspect you might be optimizing on something that turns out not to be important. And unless you anticipate having literally millions of messages, I wouldn't worry about disk space, especially because the timestamp is small compared to your other fields. If you add timestamps, your table will be about 5MB larger for each million messages. That's really nothing.
– Jerry
1 hour ago
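The "measure it" suggestion above can be sketched roughly like this. Everything here is an assumption made for illustration: the table name `messages`, the index name `idx_recipient`, the synthetic data, and the use of Python's sqlite3 in place of MySQL. It measures wall-clock time only, not cpu%, mem% or load average, and the numbers it prints do not carry over to MySQL/InnoDB. To get meaningful figures, run the same kind of loop against your real server and data.

```python
import sqlite3
import time

# Assumption: SQLite stands in for MySQL; names and data are made up.


def build(n_users=50, per_user=200):
    """Create an in-memory table with per_user messages for each of n_users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, recipient TEXT, message TEXT)"
    )
    conn.execute("CREATE INDEX idx_recipient ON messages (recipient)")
    rows = [
        (f"u{(i + 1) % n_users}", f"u{i % n_users}", "x")
        for i in range(n_users * per_user)
    ]
    conn.executemany(
        "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)", rows
    )
    return conn


def time_query(conn, sql, params, repeats=2000):
    """Run the query `repeats` times; return (last result, elapsed seconds)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = conn.execute(sql, params).fetchone()[0]
    return result, time.perf_counter() - start


conn = build()
count_result, count_secs = time_query(
    conn, "SELECT COUNT(*) FROM messages WHERE recipient = ?", ("u7",)
)
max_result, max_secs = time_query(
    conn, "SELECT MAX(id) FROM messages WHERE recipient = ?", ("u7",)
)
print(f"COUNT(*): {count_secs:.4f}s  MAX(id): {max_secs:.4f}s")
```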
2 Answers
In MySQL InnoDB, SELECT COUNT(*) WHERE secondary_index = ? is an expensive operation, and when the user has a lot of messages, this query might take a long time. Even when using an index, the engine still needs to count all matching records.
On the other hand, SELECT MAX(id) WHERE secondary_index = ? can deliver the highest id in that index very efficiently and runs in close to constant time by doing a so-called loose index scan.
If you want to understand why, consider looking up the B+ tree data structure, which InnoDB uses to organise its data.
I suggest you go with SELECT MAX(id) if the requirement is only to check whether there are new messages (and not how many there are).
Also, if you rely on the message count, you might open a gap for race conditions. What if the user deletes a message and receives a new one between two polling intervals?
answered 1 hour ago by Kaii, edited 1 hour ago
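The race described in the last paragraph of this answer can be reproduced with a small sketch. It is an illustration under assumptions: a hypothetical `messages` table with an index on `recipient`, rows that are physically deleted (with soft deletes via a hidden flag the count would not drop), and Python's sqlite3 standing in for MySQL. AUTOINCREMENT is used so that ids are never reused, as with an auto-increment primary key.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "sender TEXT, recipient TEXT, message TEXT)"
)
conn.execute("CREATE INDEX idx_recipient ON messages (recipient)")


def poll(conn, user):
    """Return (message count, highest id) for the recipient."""
    count = conn.execute(
        "SELECT COUNT(*) FROM messages WHERE recipient = ?", (user,)
    ).fetchone()[0]
    max_id = conn.execute(
        "SELECT MAX(id) FROM messages WHERE recipient = ?", (user,)
    ).fetchone()[0] or 0
    return count, max_id


conn.executemany(
    "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
    [("alice", "bob", "one"), ("alice", "bob", "two")],
)
before = poll(conn, "bob")

# Between two polls: one message is hard-deleted and a new one arrives.
conn.execute("DELETE FROM messages WHERE id = 1")
conn.execute(
    "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
    ("carol", "bob", "three"),
)
after = poll(conn, "bob")

count_sees_new = after[0] > before[0]
max_sees_new = after[1] > before[1]
print(before, after, count_sees_new, max_sees_new)
```

In this run the count is the same before and after, so a count-based check misses the new message, while the MAX(id) comparison catches it.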
See: dba.stackexchange.com/questions/130780/mysql-count-performance
– Kaii
1 hour ago
"SELECT MAX(id) will always use the primary index" - yeah, except for the cases when there's a where on an unindexed field.
– Sergio Tulentsev
1 hour ago
@SergioTulentsev I forgot to mention in my main post that sender and recipient are foreign keys to the user hash (ID), the primary key of the users table. So they will always be indexed.
– FeHora
1 hour ago
@Kaii "Also, if you rely on the message count you might open a gap for race conditions. What if the user deletes a message and receives a new one between two polling intervals?" If the user deletes a message, it just becomes hidden for security reasons; it will have the value hidden:true, but the count will not change.
– FeHora
1 hour ago
If there's an index on a, then SELECT MAX(id) FROM tbl WHERE a=constant uses a so-called loose index scan. Those are almost miraculously fast. SELECT COUNT(*) FROM tbl WHERE a=constant does a tight index scan, which is not as fast.
– O. Jones
1 hour ago
To know whether someone has new messages, record exactly that. Update a field in the users table (I'm assuming that's the name) when a new message is recorded in the system. You have the recipient's ID; that's all you need. You can create an after insert trigger (assumption: there's a users2messages table) that updates the users table with a boolean flag indicating there's a message.
This approach is by far faster than counting index entries, be the index primary or secondary. When the user performs an action, you update the users table with has_messages = 0; when a new message arrives, you update it with has_messages = 1. It's simple, it works, it scales, and using triggers to maintain it makes it easy and seamless.
I'm sure there will be naysayers who don't like triggers; in that case, you can do it manually at the point of associating a user with a new message.
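A rough sketch of the flag-plus-trigger idea follows. The details are assumptions for illustration only: the trigger is attached directly to a hypothetical `messages` table rather than a users2messages table, the trigger name `messages_after_insert` is made up, and the trigger is written in SQLite syntax through Python's sqlite3 (MySQL's CREATE TRIGGER needs FOR EACH ROW and is usually wrapped in a DELIMITER change).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all table and trigger names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id TEXT PRIMARY KEY,
        has_messages INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        sender TEXT NOT NULL,
        recipient TEXT NOT NULL,
        message TEXT NOT NULL
    );
    -- Raise the recipient's flag whenever a message row is inserted.
    CREATE TRIGGER messages_after_insert AFTER INSERT ON messages
    BEGIN
        UPDATE users SET has_messages = 1 WHERE id = NEW.recipient;
    END;
    INSERT INTO users (id) VALUES ('alice'), ('bob');
    """
)


def has_messages(conn, user_id):
    """Primary-key lookup of the flag."""
    row = conn.execute(
        "SELECT has_messages FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return bool(row[0])


def mark_read(conn, user_id):
    """Clear the flag once the user has fetched their messages."""
    conn.execute("UPDATE users SET has_messages = 0 WHERE id = ?", (user_id,))


before = has_messages(conn, "bob")
conn.execute(
    "INSERT INTO messages (sender, recipient, message) VALUES (?, ?, ?)",
    ("alice", "bob", "hi"),
)
after_insert = has_messages(conn, "bob")
mark_read(conn, "bob")
after_read = has_messages(conn, "bob")
print(before, after_insert, after_read)
```

Without a trigger, the same UPDATE can be issued by the application in the transaction that inserts the message.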
Triggers aside, looking up a row using the PK and also reading it to check the boolean is still more expensive than executing a single loose index scan. It gets worse when you also add a WHERE clause to check the boolean flag, because of the low cardinality, even if you index that field. Sorry to tell you that, but you have a misunderstanding there.
– Kaii
1 hour ago
@Mjh I know about that, but it's definitely more expensive than my suggested methods, because it involves (at least) 1x update + 1x select.
– FeHora
55 mins ago
@Kaii SELECT has_messages FROM users WHERE id = 1; is the fastest query there is. It's an eq_ref, which is infinitely faster than counting a number of records in the table. The boolean field is not in the WHERE clause, the primary key is. Please, assume better next time. In regards to updating the table: the update is fast as well, it handles a single row located using the primary key. If the field already contains the value that you're updating to, no actual disk I/O occurs and there's a minimal performance penalty. Much less than counting the records. You can measure.
– Mjh
41 mins ago
add a comment |
To have the information that someone has new messages - do exactly that. Update the field in users
table (I'm assuming that's the name) when a new message is recorded in the system. You have the recipient's ID, that's all you need. You can create an after insert
trigger (assumption: there's users2messages
table) that updates users table with a boolean flag indicating there's a message.
This approach is by far faster than counting indexes, be the index primary or secondary. When the user performs an action, you can update the users
table with has_messages = 0
, when a new message arrives - you update the table with has_messages = 1
. It's simple, it works, it scales and using triggers to maintain it makes it easy and seamless.
I'm sure there will be nay-sayers who don't like triggers, you can do it manually at the point of associating a user with a new message.
Triggers aside, looking up a row using the PK and also reading it to check the boolean is still more expensive than executing a single loose index scan. It gets worse when you also add a WHERE clause to check the boolean flag, because of the low cardinality, even if you index that field. Sorry to tell you that, but you have a misunderstanding there.
– Kaii
1 hour ago
@Mjh I know about that, but it's definitely more expensive than my suggested methods, because it contains (at least) 1x update + 1x select.
– FeHora
55 mins ago
@Kaii `SELECT has_messages FROM users WHERE id = 1;` is the fastest query there is. It's an `eq_ref`, which is infinitely faster than counting a number of records in the table. The boolean field is not in the `WHERE` clause, the primary key is. Please, assume better next time. Regarding updating the table: the update is fast as well, since it handles a single row located using the primary key. If the field already contains the value that you're updating to, no actual disk I/O occurs and there's a minimal performance penalty, much less than counting the records. You can measure.
– Mjh
41 mins ago
answered 1 hour ago by Mjh
I think you would be better off by adding a timestamp column and checking against that value to see if there are newer records.
– Dharman
2 hours ago
Whether you query a timestamp or the ID, use `MAX()` on that column, and make sure it's indexed with `(user_id, timestamp)`.
– The Impaler
1 hour ago
@Dharman I was thinking of it, but it costs extra DB space, and I am not sure it will be faster than either of my methods. I am storing a simple number (the current message count) in the usernames table.
– FeHora
1 hour ago
Calculate? No idea. But you can measure it. Fire off a few thousand of each query and watch machine metrics (cpu%, mem%, load average, etc.)
– Sergio Tulentsev
1 hour ago
While there is a good answer to this question below, I suspect you might be optimizing on something that turns out not to be important. And unless you anticipate having literally millions of messages, I wouldn't worry about disk space, especially because the timestamp is small compared to your other fields. If you add timestamps, your table will be about 5MB larger for each million messages. That's really nothing.
– Jerry
1 hour ago
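The timestamp suggestion from the comments above could look roughly like this. It is a sketch under assumptions, not the asker's schema: SQLite (via Python's `sqlite3`) stands in for MySQL, and the `messages` table, its `created_at` column, the index name, and the `has_newer` helper are all hypothetical. The composite index on `(user_id, created_at)` is what is meant to let the engine answer `MAX()` for one user without scanning that user's rows.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL, purely so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    created_at INTEGER NOT NULL  -- Unix timestamp, hypothetical column
);
-- Composite index as suggested in the comments: (user_id, timestamp).
CREATE INDEX idx_messages_user_created ON messages (user_id, created_at);
""")


def has_newer(user_id, last_seen):
    """Return True if the user has a message newer than last_seen."""
    row = conn.execute(
        "SELECT MAX(created_at) FROM messages WHERE user_id = ?", (user_id,)
    ).fetchone()
    latest = row[0]  # None when the user has no messages at all
    return latest is not None and latest > last_seen


conn.executemany(
    "INSERT INTO messages (user_id, created_at) VALUES (?, ?)",
    [(1, 100), (1, 200), (2, 150)],
)
```

The client keeps the last timestamp it saw and passes it as `last_seen`; the same pattern works with `MAX(id)` if IDs are strictly increasing.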