Recipies

Migration cross-check

Situation

You’ve migrated a Namespace, a Tenant or an entire system to another HCP, using replication, and you need to have a verification that the data in source and target is exactly the same.

Recipe

This example will use a single Tenant as an example.

Acquire a list of existing objects from both HCP systems

  • Use the hcpmqe tool to query both HCP systems for a list of existing objects:

    • Use a Tenant user with Search permission for all Namespaces within the Tenant

    • Select create as the only transaction type

    • Leave Start time at the default, set End Time to when you finished the migration

    • Select sqlite3 as output format

    • Check verbose

    • Run the query for both involved HCP systems

    _images/recipe01_80.png _images/recipe01_85.png
    • You should have two database files, once finished:

      $ ls -lh *.db
      -rw-r--r--  1 tsimons  staff   2.4M Nov 24 17:03 awhdis2_hcp80.db
      -rw-r--r--  1 tsimons  staff   2.4M Nov 24 09:04 awhdis2_hcp85.db
      

Note

For a comparison like this, just a few of the columns in the databases are relevant to clearly identify an object:

  • hash

  • ingesttime

  • namespace

  • objectPath

  • version

Some more are interesting, as well:

  • replicated

  • size

  • Use the sqlite3 commandline tool to run SQL queries to compare the two databases:

    Note

    For a valid result, make sure to limit the set of objects investigated to exactly the same time frame - we’ll use the epoch time stamp (seconds since 1970/1/1 0:00:00) for that - you can use this to convert.

    For this example, migration ended 2020/11/23 08:00:00 –> 1606114800 epoch time.

    • Open the origin HCP database (awhdis2_hcp80.db):

      $ sqlite3 awhdis2_hcp80.db
      
    • Attach the migration target HCP database (awhdis2_hcp85.db):

      sqlite> ATTACH 'awhdis2_hcp85.db' AS replica;
      
    • Check if the no. of records are equal:

      sqlite> SELECT count(*) FROM main.ops
                  WHERE ingestTime <= 1606114800;
      4673
      sqlite> SELECT count(*) FROM replica.ops
                  WHERE ingestTime <= 1606114800;
      4673
      
    • Check if there are any non-replicated objects:

      sqlite> SELECT count(*) FROM main.ops
                  WHERE NOT replicated
                        AND ingestTime <= 1606114800;
      0
      sqlite> SELECT count(*) FROM replica.ops
                  WHERE NOT replicated
                        AND ingestTime <= 1606114800;
      0
      
    • Now, lets check which records don’t exist in one of the databases:

      • List all records not in the migration target database:

        sqlite> SELECT hash, ingesttime, namespace,
                       objectPath, version FROM main.ops
                    WHERE ingestTime <= 1606114800
                    EXCEPT SELECT hash, ingesttime, namespace,
                                  objectPath, version
                                FROM replica.ops
                                    WHERE ingestTime <= 1606114800;
        [..]
        
      • List all records not in the origin database:

        sqlite> SELECT hash, ingesttime, namespace,
                       objectPath, version FROM replica.ops
                    WHERE ingestTime <= 1606114800
                    EXCEPT SELECT hash, ingesttime, namespace,
                                  objectPath, version
                                FROM main.ops
                                    WHERE ingestTime <= 1606114800;
        [..]
        

      Alternative way to achieve the same result:

      • List all records not in the migration target database:

        sqlite> SELECT DISTINCT hash, ingesttime, namespace,
                                objectPath, version FROM main.ops
                    WHERE ingestTime <= 1606114800
                        AND (hash, ingesttime, namespace, objectPath,
                             version) NOT IN
                        (SELECT DISTINCT hash, ingesttime, namespace,
                                         objectPath, version
                            FROM replica.ops
                                WHERE ingestTime <= 1606114800);
        
      • List all records not in the origin database:

        sqlite> SELECT DISTINCT hash, ingesttime, namespace,
                                objectPath, version FROM replica.ops
                    WHERE ingestTime <= 1606114800
                        AND (hash, ingesttime, namespace, objectPath,
                            version) NOT IN
                        (SELECT DISTINCT hash, ingesttime, namespace,
                                         objectPath, version
                            FROM main.ops
                                WHERE ingestTime <= 1606114800);