Clustered Sequential GUID Primary Key vs Non-Clustered GUID and Clustered Sequential ID Primary Keys

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP

up vote
2
down vote

favorite

I’m trying to figure out what the optimal solution is in regards to size and performance when using GUIDs as a primary key. The main reason for wanting GUIDs is that it’s more appropriate in our micro-service architecture. There seems to be a lot of debate on what’s actually best.

The two main solutions appear to be:

  1. Have the primary key as a sequential GUID (generated by the
    application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)
    with a clustered index.

  2. Have two primary keys, a randomly generated GUID with a
    non-clustered index and a basic int IDENTITY column with a
    clustered index — the approach is described best in this stackoverflow response here

I’m not sure which solution is best. The first method seems to be the most popular and widely used from what I’ve read. The second seems to have this benefit from that linked stackoverflow response:

Then there’s another issue to consider: the clustering key on a table will be added to each and every entry on each and every non-clustered index on your table as well – thus you really want to make sure it’s as small as possible.

Any advice would be appreciated.

share|improve this question

  • Having the GUID in the NC indexes can also be a benefit, if the queries that are driven by such an index also refers to the GUID. If you want to cover such a query, then the GUID is already in there.
    – Tibor Karaszi
    Nov 29 at 11:37

up vote
2
down vote

favorite

I’m trying to figure out what the optimal solution is in regards to size and performance when using GUIDs as a primary key. The main reason for wanting GUIDs is that it’s more appropriate in our micro-service architecture. There seems to be a lot of debate on what’s actually best.

The two main solutions appear to be:

  1. Have the primary key as a sequential GUID (generated by the
    application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)
    with a clustered index.

  2. Have two primary keys, a randomly generated GUID with a
    non-clustered index and a basic int IDENTITY column with a
    clustered index — the approach is described best in this stackoverflow response here

I’m not sure which solution is best. The first method seems to be the most popular and widely used from what I’ve read. The second seems to have this benefit from that linked stackoverflow response:

Then there’s another issue to consider: the clustering key on a table will be added to each and every entry on each and every non-clustered index on your table as well – thus you really want to make sure it’s as small as possible.

Any advice would be appreciated.

share|improve this question

  • Having the GUID in the NC indexes can also be a benefit, if the queries that are driven by such an index also refers to the GUID. If you want to cover such a query, then the GUID is already in there.
    – Tibor Karaszi
    Nov 29 at 11:37

up vote
2
down vote

favorite

up vote
2
down vote

favorite

I’m trying to figure out what the optimal solution is in regards to size and performance when using GUIDs as a primary key. The main reason for wanting GUIDs is that it’s more appropriate in our micro-service architecture. There seems to be a lot of debate on what’s actually best.

The two main solutions appear to be:

  1. Have the primary key as a sequential GUID (generated by the
    application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)
    with a clustered index.

  2. Have two primary keys, a randomly generated GUID with a
    non-clustered index and a basic int IDENTITY column with a
    clustered index — the approach is described best in this stackoverflow response here

I’m not sure which solution is best. The first method seems to be the most popular and widely used from what I’ve read. The second seems to have this benefit from that linked stackoverflow response:

Then there’s another issue to consider: the clustering key on a table will be added to each and every entry on each and every non-clustered index on your table as well – thus you really want to make sure it’s as small as possible.

Any advice would be appreciated.

share|improve this question

I’m trying to figure out what the optimal solution is in regards to size and performance when using GUIDs as a primary key. The main reason for wanting GUIDs is that it’s more appropriate in our micro-service architecture. There seems to be a lot of debate on what’s actually best.

The two main solutions appear to be:

  1. Have the primary key as a sequential GUID (generated by the
    application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)
    with a clustered index.

  2. Have two primary keys, a randomly generated GUID with a
    non-clustered index and a basic int IDENTITY column with a
    clustered index — the approach is described best in this stackoverflow response here

I’m not sure which solution is best. The first method seems to be the most popular and widely used from what I’ve read. The second seems to have this benefit from that linked stackoverflow response:

Then there’s another issue to consider: the clustering key on a table will be added to each and every entry on each and every non-clustered index on your table as well – thus you really want to make sure it’s as small as possible.

Any advice would be appreciated.

sql-server performance index clustered-index uuid

share|improve this question

share|improve this question

share|improve this question

share|improve this question

asked Nov 29 at 11:14

Callum Breen

132

132

  • Having the GUID in the NC indexes can also be a benefit, if the queries that are driven by such an index also refers to the GUID. If you want to cover such a query, then the GUID is already in there.
    – Tibor Karaszi
    Nov 29 at 11:37

  • Having the GUID in the NC indexes can also be a benefit, if the queries that are driven by such an index also refers to the GUID. If you want to cover such a query, then the GUID is already in there.
    – Tibor Karaszi
    Nov 29 at 11:37

Having the GUID in the NC indexes can also be a benefit, if the queries that are driven by such an index also refers to the GUID. If you want to cover such a query, then the GUID is already in there.
– Tibor Karaszi
Nov 29 at 11:37

Having the GUID in the NC indexes can also be a benefit, if the queries that are driven by such an index also refers to the GUID. If you want to cover such a query, then the GUID is already in there.
– Tibor Karaszi
Nov 29 at 11:37

2 Answers
2

active

oldest

votes

up vote
2
down vote

accepted

If you’re going to use a GUID key, simply use a clustered index with NEWSEQENTIALID() and move on with your application. The alternative designs are simply not predictably and significantly better when you consider the extra data structures and complexity they introduce. So while careful testing of your actual application might reveal a superior alternative design, just start with a simple clustered index on a sequentially-generated GUID.

For the gory details on why it doesn’t really matter that the GUIDs are generated in the “middle” of the sort order, and the insertion point changes see: Good Page Splits and Sequential GUID Key Generation.

share|improve this answer

    up vote
    1
    down vote

    Using an INT (clustered) and a UNQUEIDENTIFIER (not) could save significant space if you have a lot of non-clustered indexes on the table, though be careful to make sure this internal value never leaks out into other layers of your application, and make sure it is well documented so others working on the DB know, as your distributed system could accidentally gain a global dependency on what should be a local only value.

    There is a third option to consider: do you have anything else in the table that might benefit from being the clustering key? Perhaps a DATE column that sees a lot of range queries? Range queries on that column would then see a performance increase, you wouldn’t be subject to the fragmentation see with arbitrary UUIDs and the date+uniquifier while still larger than an integer at 8 bytes is still smaller than the 16 needed for a UUID. Even if it is a DATETIME, or DATETIME2 depending on precision, the result is 12 bytes, so if you are using system managed history tables and this is a base table perhaps consider using sysStartTime.

    Even if all you have is an INT (or smaller) that doesn’t vary a lot, that plus the uniquifier will be a smaller key than a UUID and the uniquifier is effectively an INT IDENTITY(1,1) so it’ll behave like your extra INT colummn without needing to pollute the data model with an extra column existing only for size/performance reasons.

    (generated by the application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)

    It is “sequential enough” to vastly reduce any fragmentation concern though, so don’t go out of your way re-inventing the wheel if that is your only issue.

    share|improve this answer

    • 1

      The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
      – Aaron Bertrand
      Nov 29 at 13:25

    Your Answer

    StackExchange.ready(function() {
    var channelOptions = {
    tags: “”.split(” “),
    id: “182”
    };
    initTagRenderer(“”.split(” “), “”.split(” “), channelOptions);

    StackExchange.using(“externalEditor”, function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using(“snippets”, function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: ‘answer’,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: “”,
    imageUploader: {
    brandingHtml: “Powered by u003ca class=”icon-imgur-white” href=”https://imgur.com/”u003eu003c/au003e”,
    contentPolicyHtml: “User contributions licensed under u003ca href=”https://creativecommons.org/licenses/by-sa/3.0/”u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href=”https://stackoverflow.com/legal/content-policy”u003e(content policy)u003c/au003e”,
    allowUrls: true
    },
    onDemand: true,
    discardSelector: “.discard-answer”
    ,immediatelyShowMarkdownHelp:true
    });

    }
    });

    draft saved
    draft discarded

    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin(‘.new-post-login’, ‘https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223737%2fclustered-sequential-guid-primary-key-vs-non-clustered-guid-and-clustered-sequen%23new-answer’, ‘question_page’);
    }
    );

    Post as a guest

    Required, but never shown

    2 Answers
    2

    active

    oldest

    votes

    2 Answers
    2

    active

    oldest

    votes

    active

    oldest

    votes

    active

    oldest

    votes

    up vote
    2
    down vote

    accepted

    If you’re going to use a GUID key, simply use a clustered index with NEWSEQENTIALID() and move on with your application. The alternative designs are simply not predictably and significantly better when you consider the extra data structures and complexity they introduce. So while careful testing of your actual application might reveal a superior alternative design, just start with a simple clustered index on a sequentially-generated GUID.

    For the gory details on why it doesn’t really matter that the GUIDs are generated in the “middle” of the sort order, and the insertion point changes see: Good Page Splits and Sequential GUID Key Generation.

    share|improve this answer

      up vote
      2
      down vote

      accepted

      If you’re going to use a GUID key, simply use a clustered index with NEWSEQENTIALID() and move on with your application. The alternative designs are simply not predictably and significantly better when you consider the extra data structures and complexity they introduce. So while careful testing of your actual application might reveal a superior alternative design, just start with a simple clustered index on a sequentially-generated GUID.

      For the gory details on why it doesn’t really matter that the GUIDs are generated in the “middle” of the sort order, and the insertion point changes see: Good Page Splits and Sequential GUID Key Generation.

      share|improve this answer

        up vote
        2
        down vote

        accepted

        up vote
        2
        down vote

        accepted

        If you’re going to use a GUID key, simply use a clustered index with NEWSEQENTIALID() and move on with your application. The alternative designs are simply not predictably and significantly better when you consider the extra data structures and complexity they introduce. So while careful testing of your actual application might reveal a superior alternative design, just start with a simple clustered index on a sequentially-generated GUID.

        For the gory details on why it doesn’t really matter that the GUIDs are generated in the “middle” of the sort order, and the insertion point changes see: Good Page Splits and Sequential GUID Key Generation.

        share|improve this answer

        If you’re going to use a GUID key, simply use a clustered index with NEWSEQENTIALID() and move on with your application. The alternative designs are simply not predictably and significantly better when you consider the extra data structures and complexity they introduce. So while careful testing of your actual application might reveal a superior alternative design, just start with a simple clustered index on a sequentially-generated GUID.

        For the gory details on why it doesn’t really matter that the GUIDs are generated in the “middle” of the sort order, and the insertion point changes see: Good Page Splits and Sequential GUID Key Generation.

        share|improve this answer

        share|improve this answer

        share|improve this answer

        edited Nov 29 at 15:36

        answered Nov 29 at 14:04

        David Browne – Microsoft

        10.1k725

        10.1k725

            up vote
            1
            down vote

            Using an INT (clustered) and a UNQUEIDENTIFIER (not) could save significant space if you have a lot of non-clustered indexes on the table, though be careful to make sure this internal value never leaks out into other layers of your application, and make sure it is well documented so others working on the DB know, as your distributed system could accidentally gain a global dependency on what should be a local only value.

            There is a third option to consider: do you have anything else in the table that might benefit from being the clustering key? Perhaps a DATE column that sees a lot of range queries? Range queries on that column would then see a performance increase, you wouldn’t be subject to the fragmentation see with arbitrary UUIDs and the date+uniquifier while still larger than an integer at 8 bytes is still smaller than the 16 needed for a UUID. Even if it is a DATETIME, or DATETIME2 depending on precision, the result is 12 bytes, so if you are using system managed history tables and this is a base table perhaps consider using sysStartTime.

            Even if all you have is an INT (or smaller) that doesn’t vary a lot, that plus the uniquifier will be a smaller key than a UUID and the uniquifier is effectively an INT IDENTITY(1,1) so it’ll behave like your extra INT colummn without needing to pollute the data model with an extra column existing only for size/performance reasons.

            (generated by the application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)

            It is “sequential enough” to vastly reduce any fragmentation concern though, so don’t go out of your way re-inventing the wheel if that is your only issue.

            share|improve this answer

            • 1

              The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
              – Aaron Bertrand
              Nov 29 at 13:25

            up vote
            1
            down vote

            Using an INT (clustered) and a UNQUEIDENTIFIER (not) could save significant space if you have a lot of non-clustered indexes on the table, though be careful to make sure this internal value never leaks out into other layers of your application, and make sure it is well documented so others working on the DB know, as your distributed system could accidentally gain a global dependency on what should be a local only value.

            There is a third option to consider: do you have anything else in the table that might benefit from being the clustering key? Perhaps a DATE column that sees a lot of range queries? Range queries on that column would then see a performance increase, you wouldn’t be subject to the fragmentation see with arbitrary UUIDs and the date+uniquifier while still larger than an integer at 8 bytes is still smaller than the 16 needed for a UUID. Even if it is a DATETIME, or DATETIME2 depending on precision, the result is 12 bytes, so if you are using system managed history tables and this is a base table perhaps consider using sysStartTime.

            Even if all you have is an INT (or smaller) that doesn’t vary a lot, that plus the uniquifier will be a smaller key than a UUID and the uniquifier is effectively an INT IDENTITY(1,1) so it’ll behave like your extra INT colummn without needing to pollute the data model with an extra column existing only for size/performance reasons.

            (generated by the application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)

            It is “sequential enough” to vastly reduce any fragmentation concern though, so don’t go out of your way re-inventing the wheel if that is your only issue.

            share|improve this answer

            • 1

              The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
              – Aaron Bertrand
              Nov 29 at 13:25

            up vote
            1
            down vote

            up vote
            1
            down vote

            Using an INT (clustered) and a UNQUEIDENTIFIER (not) could save significant space if you have a lot of non-clustered indexes on the table, though be careful to make sure this internal value never leaks out into other layers of your application, and make sure it is well documented so others working on the DB know, as your distributed system could accidentally gain a global dependency on what should be a local only value.

            There is a third option to consider: do you have anything else in the table that might benefit from being the clustering key? Perhaps a DATE column that sees a lot of range queries? Range queries on that column would then see a performance increase, you wouldn’t be subject to the fragmentation see with arbitrary UUIDs and the date+uniquifier while still larger than an integer at 8 bytes is still smaller than the 16 needed for a UUID. Even if it is a DATETIME, or DATETIME2 depending on precision, the result is 12 bytes, so if you are using system managed history tables and this is a base table perhaps consider using sysStartTime.

            Even if all you have is an INT (or smaller) that doesn’t vary a lot, that plus the uniquifier will be a smaller key than a UUID and the uniquifier is effectively an INT IDENTITY(1,1) so it’ll behave like your extra INT colummn without needing to pollute the data model with an extra column existing only for size/performance reasons.

            (generated by the application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)

            It is “sequential enough” to vastly reduce any fragmentation concern though, so don’t go out of your way re-inventing the wheel if that is your only issue.

            share|improve this answer

            Using an INT (clustered) and a UNQUEIDENTIFIER (not) could save significant space if you have a lot of non-clustered indexes on the table, though be careful to make sure this internal value never leaks out into other layers of your application, and make sure it is well documented so others working on the DB know, as your distributed system could accidentally gain a global dependency on what should be a local only value.

            There is a third option to consider: do you have anything else in the table that might benefit from being the clustering key? Perhaps a DATE column that sees a lot of range queries? Range queries on that column would then see a performance increase, you wouldn’t be subject to the fragmentation see with arbitrary UUIDs and the date+uniquifier while still larger than an integer at 8 bytes is still smaller than the 16 needed for a UUID. Even if it is a DATETIME, or DATETIME2 depending on precision, the result is 12 bytes, so if you are using system managed history tables and this is a base table perhaps consider using sysStartTime.

            Even if all you have is an INT (or smaller) that doesn’t vary a lot, that plus the uniquifier will be a smaller key than a UUID and the uniquifier is effectively an INT IDENTITY(1,1) so it’ll behave like your extra INT colummn without needing to pollute the data model with an extra column existing only for size/performance reasons.

            (generated by the application, rather than SQL Servers built in NEWSEQUENTIALID() as it isn’t truly sequential)

            It is “sequential enough” to vastly reduce any fragmentation concern though, so don’t go out of your way re-inventing the wheel if that is your only issue.

            share|improve this answer

            share|improve this answer

            share|improve this answer

            answered Nov 29 at 11:42

            David Spillett

            21.9k23167

            21.9k23167

            • 1

              The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
              – Aaron Bertrand
              Nov 29 at 13:25

            • 1

              The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
              – Aaron Bertrand
              Nov 29 at 13:25

            1

            1

            The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
            – Aaron Bertrand
            Nov 29 at 13:25

            The key takeaway from the first sentence is that your primary key doesn’t have to be clustered and you can have multiple “keys” that are enforced through unique constraints/indexes instead of the primary key syntax. The end result is largely the same.
            – Aaron Bertrand
            Nov 29 at 13:25

            draft saved
            draft discarded

            Thanks for contributing an answer to Database Administrators Stack Exchange!

            • Please be sure to answer the question. Provide details and share your research!

            But avoid

            • Asking for help, clarification, or responding to other answers.
            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.

            Some of your past answers have not been well-received, and you’re in danger of being blocked from answering.

            Please pay close attention to the following guidance:

            • Please be sure to answer the question. Provide details and share your research!

            But avoid

            • Asking for help, clarification, or responding to other answers.
            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.

            draft saved

            draft discarded

            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin(‘.new-post-login’, ‘https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223737%2fclustered-sequential-guid-primary-key-vs-non-clustered-guid-and-clustered-sequen%23new-answer’, ‘question_page’);
            }
            );

            Post as a guest

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Required, but never shown

            Related Post

            Leave a Reply

            Your email address will not be published. Required fields are marked *