MERGE - GoToAssist Monitoring

Find an Answer

Search GoToAssist Monitoring articles, videos and user guides   Your search term must have 2 or more characters.

MERGE

Syntax

MERGE statements are a hybrid tool for inserting or updating information to the database in situations where GoToAssist needs to make an intelligent decision about which data to insert, which to update (replace), and which data to leave unchanged.

The MERGE statement identifies the node or nodes in question, and then looks for it. If it finds it, it adds any missing information, or updates any obsolete information. If it doesn't find the node, it creates it. You can use MERGE statements to add nodes that were not originally discovered.

Think of a MERGE statement as a tree of nested condition statements. Each condition statement identifies a node or a set of nodes that match the condition. Going down the tree, condition statements are applied one by one. After each condition statement is satisfied, the children of that condition statement are then applied, making the statement more and more specific.

Any single condition statement may of course match more than one node. Make sure that you construct your MERGE statement with that in mind. If you want to merge data into a single node in your PQL database, make sure that your condition statements identify only that node and filter out all others.

Caution: Be careful using MERGE statements. Because a MERGE statement replaces existing data with new data, it's possible to accidently delete or overwrite existing data by using such innocuous-looking statements as these two:

MERGE INTO / VALUES {};
MERGE @'timestamp' INTO / VALUES {};

The possible elements of an MERGE statement include:

pql_statement 	             ::= 	MERGE [INTO] [@timestamp] path 
                                        VALUES conditional_tree_literal
		
timestamp 	              ::= 	'yyyy-mm-ddThh:mm:dd.uuuuuuZ'
conditional_tree_literal     ::=    	
conditional_tree_assignment  ::= 	{conditional_tree_assignment [,conditional_tree_assignment]}
                                       identifier[boolean_expression] => expression
                                       | identifier[boolean_expression] => tree_literal
                                       | identifier => expression
                                       | identifier => tree_literal 

Description

It is important to understand when and why the MERGE statement updates (replaces) data, and when it simply inserts data, because this statement can accidently delete your entire tree if used incorrectly.

The MERGE rules are as follows:

  • Rule 1: If no conditions exist, then all existing nodes with the same name are replaced by the new node. You can add a TRUE condition for a node that otherwise has no conditions — such as the top-level network node — to prevent the entire tree from being replaced. See example of Rule 1.
  • Rule 2: If a condition exists but is not met by any node, then a new node is created. You can force new node creation by adding a FALSE condition for nodes that you always want replaced, such as logs. See example of Rule 2.
  • Rule 3: If a condition exists and is met by multiple nodes, then all matching nodes are updated.

If there are no conditions on a node because the Crawler is reporting all known information about the node, then MERGE updates (replaces), rather than inserts this information. If the server finds no conditions, it replaces all existing nodes that have the same name. You can avoid Rule 1 by adding a condition to exclude a particular node from this rule, where the node has no other conditions to distinguish it such as the network node, by adding a TRUE condition.

The elements of an MERGE statement are defined as follows:

MERGE [INTO]  

The MERGE statement is a hybrid between INSERT and UPDATE, for situations where the GoToAssist Crawler discovers new data and needs to make intelligent decisions about how to merge it with existing data. For example, if either the new or the old data is incomplete, then the server must decide which data to insert, which data to replace or update, and which data to keep unchanged.

The INTO keyword is optional.

The MERGE clause must include a path to indicate the location of the data to be added or replaced, and can include a [@timestamp] option to add or replace history. A path can be represented by an identifier or a boolean_expression.

The [@timestamp] option updates history, and is represented by the string: 'yyyy-mm-ddThh:mm:dd.uuuuuuZ', where uuuuuu is the fraction of a second in microseconds. For example: 2007-09-02T16:14:52.171176Z .

VALUES  

Indicates the data to be added or replaced, and must be specified with a conditional_tree_literal.

A conditional_tree_literal is one or more conditional_tree_assignments. A conditional_tree_assignment can be any tree_assignment with additional conditions.

Examples

You can try the following examples yourself against fictitious data in the Query Sandbox:

Example 1: Simple merge

Suppose you want to add some information to the database, but are not sure whether it is actually missing or not. Some of the data is new, and some must update (replace) existing data. You could write an MERGE statement that specifies that where b is 2, a should be inserted, and where b is not 2, b should be updated to 3, like this:

  MERGE INTO / values {
    a[b='2'] => {
      b => '3'
    }
  }
  

This statement starts with the MERGE INTO keywords, indicating your intention to insert and/or update data. Next is a slash / to represent the root node where you are taking action. The VALUES keyword follows, and then the specific values themselves.

The GoToAssist Crawler specifies information to the server in the form of a complete tree. The server then merges the data with the existing forest. The new trees include conditional paths, and the conditions specify how they should merge with the existing forest.

The resulting output might look something like this:

  row: 
    *: 
     a:
      b: 3

Example 2: Compare INSERT to MERGE

It might be helpful to compare a MERGE statement with a similar INSERT statement. The following INSERT statement simply inserts the {a => 'b'} values into the root:

INSERT INTO / values {    MERGE INTO / values { 
  a => 'b'                 a[b = '2'] => {
}			      b => '3'
				} 
                           }

The MERGE statement inserts and/or overrides data, and the conditional path: {a[b = '2'] => {b => '3'}} indicates what should be inserted and what should be updated. In this case, where b = 2, a is inserted, and where b != 2, b should be updated to 3.

The resulting output is the same as in Example 1.

Example 3: Prevent replacing the entire database

Let's use a more realistic example. Suppose the GoToAssist Crawler discovers a new device, and that it already has information about one of the interfaces on that device. The Crawler sends a statement like this to the Search Index:

 MERGE INTO / values {
   network[TRUE] => {
     device[interface/mac_address = '00:00' or 
            interface/ip_address = '1.2.3.4'] => {
       interface => {
         name => 'eth0', 
         mac_address => '00:B1:B2:B3:B4:B5', 
         ip_address => '10.10.10.40'
       }
     }
   }
 }

The device node has a condition that specifies which existing node should be updated. As we know from Rule 2, if the condition does not match any existing nodes, then a new node is created.

The interface node has no conditions. As we know from Rule 1, nodes without conditions are updated (completely replaced). If the server finds no conditions, it replaces all existing nodes with that name.

Note that the network node has a condition of [TRUE]. That is because the top-level network node normally has no conditions. Rule 1 completely replaces nodes without conditions. So [TRUE] is used as an exception to Rule 1. If we do not use this exception, every update would replace the network node, which is equivalent to replacing the entire tree.

If this statement is run on the sample database, it produces the following results:

              row: 
    *: 
      network: 
        device: 
          interface: 
            ip_address: 10.10.10.40
            mac_address: 00:B1:B2:B3:B4:B5    
            name: eth0


        

Example 4: Merge processes

In this example, the Crawler reports the set of running processes for a device:

MERGE INTO / values {
   network[TRUE] => {
     device[uuid = '1'] => {
       running_processes => {
         win32_process  => 'foo', 
         win32_process => 'bar'
       }
     }
   }
 }

Note that the network node has a condition of [TRUE] to avoid (Rule 1) replacing the entire tree. This statement produces the following results:

 row: 
    *: 
      network: 
        device: 
          running_processes: 
            win32_process: foo
            win32_process: bar

Example 5: Update disk space

In this example, the free disk space is updated, but only for the disk called '/dev/hda':

MERGE INTO / values {
   network[TRUE] => {
     device[uuid = '1'] => {
       disk[name = '/dev/hda'] => {
         free_space => 10000 
       }
     }
   }
 }

This statement would produces results like this:

  row: 
    *: 
      network: 
        device: 
          disk: 
            free_space: 10000

Example 6: Force new node creation

For nodes that are never replaced, such as log messages, we need a way to force Rule 2 and ensure new node creation. You can do that by adding a [FALSE] condition so that new nodes are added instead of being updated (replaced). For that case, you can use:

MERGE INTO / values {
   log[FALSE] => {
     message => 'Service started', 
     timestamp => '1/2/07'
   }
 }
 

This statement would produces results like this:

  row: 
    *: 
     log: 
       message: Service started
       timestamp: 1/2/07

Example 7: Merge historical data

This example merges historical values by adding a timestamp. Note that the timestamp must be in single quotes:

'2008-03-21T15:50:01.528935Z'.

Note the use of an indented format to write a good MERGE statement that can be read by human eyes. The first 5 lines of the following more complex statement are boilerplate. They identify the node and its location. From the sixth line and below, the details about that node are listed:

Note: GoToAssist timestamps are in coordinated universal time (UTC/GMT). When you view your data in your GoToAssist Web account, GoToAssist converts your data to your local time. But when you retrieve timestamps from your database as results to your searches or PQL queries, it is in UTC.

MERGE@'2008-03-21T15:50:01.528935Z' into / values {
  network[true] => {
    device[interface/mac_address='00:13:20:40:9A:9E'] => {
      system[true] => {
        meminfo[true] => {
          SwapTotal[true] => '3997474816',
          SwapFree[true] => '3548553216',
          MemFree[true] => '40374272',
          HighTotal[true] => '1198030848',
          Committed_AS[true] => '1972723712',
          SUnreclaim[true] => '19660800',
          NFS_Unstable[true] => '0', 
          VmallocChunk[true] => '64999424', 
          Writeback[true] => '0', 
          MemTotal[true] => '2116120576', 
          VmallocUsed[true] => '47677440', 
          AnonPages[true] => '1147985920', 
          Active[true] => '1293049856', 
          CommitLimit[true] => '5055533056', 
          Cached[true] => '475234304', 
          SwapCached[true] => '247386112', 
          LowTotal[true] => '918089728', 
          Dirty[true] => '24576', 
          Mapped[true] => '58638336', 
          HighFree[true] => '30834688', 
          SReclaimable[true] => '64770048', 
          VmallocTotal[true] => '117432320', 
          Bounce[true] => '0', 
          Inactive[true] => '676634624', 
          PageTables[true] => '3608576', 
          Slab[true] => '84430848', 
          Buffers[true] => '346124288', 
          LowFree[true] => '9539584'
        }
      }
    }
  }
}

This statement would produces results like this:

row: 
  *: 
    network: 
      device: 
        system: 
          meminfo: 
            lowfree: 9539584
            buffers: 346124288
            slab: 84430848
            pagetables: 3608576
            inactive: 676634624
            bounce: 0
            vmalloctotal: 117432320
            sreclaimable: 64770048
            highfree: 30834688
            mapped: 58638336
            dirty: 24576
            lowtotal: 918089728
            swapcached: 247386112
            cached: 475234304
            commitlimit: 5055533056
            active: 1293049856
            anonpages: 1147985920
            vmallocused: 47677440
            memtotal: 2116120576
            writeback: 0
            vmallocchunk: 64999424
            nfs_unstable: 0
            sunreclaim: 19660800
            committed_as: 1972723712
            hightotal: 1198030848
            memfree: 40374272
            swapfree: 3548553216
            swaptotal: 3997474816

Example 8: Merge without destroying

This example merges device data.

MERGE INTO / values {
  network[true] => {
    device[interface/mac_address = '00:E0:81:58:FE:CA' 
    or interface/inet/ip_address = '50.10.33.64'] => {
      interface[mac_address = '00:0C:29:93:CE:EE' or 
         inet/ip_address = '10.10.20.27'] => {
           mac_address[true] => '00:0C:29:93:CE:EE',
           inet[ip_address = '10.10.20.27'] => {
           last_seen[true] => '2008-03-19T18:42:10.0000000Z'
           }
         }
      }
   }
}

This statement produces the following results:

row: 
row
  *
    network
      device
        interface
          inet
            last_seen 2008-02-19T18:42:10.0000000Z
          mac_address 00:0C:29:93:CE:EE
      device
        interface
          inet
            last_seen 2008-03-19T18:42:10.0000000Z
          mac_address 00:0C:29:93:CE:EE

Example 9: Complex merge

In reality, your MERGE statements may be far more complex than these examples. To add multiple nodes to your database that were not previously discovered, you may want to use a different MERGE statement for each node. Here's a more complex and more realistic example:

ERGE@'2008-03-21T17:15:12.014103Z' into / values {
 network[true] => {
  device[interface/mac_address = '00:0C:29:93:CE:EE' or 
        or interface/inet/ip_address = '10.10.20.27'] => {
    interface[mac_address = '00:0C:29:93:CE:EE' or 
           or inet/ip_address = '10.10.20.27'] => {
      mac_address[true] => '00:0C:29:93:CE:EE',
      inet[ip_address = '10.10.20.27'] => {
        ip_address[true] => '10.10.20.27'
      }
    },
    system[true] => {
      dns_name[true] => 'maruichi'
    },
    evidence => {
      dhcp[false] => {
        value[true] => {
          vendor_class[true] => 'MSFT 5.0'
        }, 
        value[true] => {
          host_name[true] => 'maruichi'
        },
        value[true] => {
          param_request_list[true] => 
          '1,15,3,6,44,46,47,31,33,43,252'
        }
      }, 
      dns_name[false] => {
        value[true] => {
          value[true] => 'maruichi'}
      }, 
      ip_address[false] => {
        value[true] => {
          value[true] => '10.10.20.27'
        }
      }, 
      mac_address[false] => {
        value[true] => {
          value[true] => '00:0C:29:93:CE:EE'
        }
      }, 
      probe[false] => {
        param[true] => {
          technique[true] => 'snmp'
        }, 
        param[true] => {
          port[true] => '161'
        },
        param[true] => {
          request_label[true] => 'snmp_sysdescr'
        }, 
        param[true] => {
          request[true] => '1.3.6.1.2.1.1.1.0'
        }, 
        value[true] => {
          response[true] => 'Hardware: x86 Family 6 Model 
                             15 Stepping 8 AT/AT COMPATIBLE 
                             - Software: Windows 2000 
                             Version 5.0 (Build 2195 
                             Uniprocessor Free)'
        }
      },
      probe[false] => {
        param[true] => {
          technique[true] => 'snmp'
        }, 
        param[true] => {
          port[true] => '161'
        }, 
        param[true] => {
          request_label[true] => 'snmp_sysname'
        }, 
        param[true] => {
          request[true] => '1.3.6.1.2.1.1.5.0'
        },
        value[true] => {
          response[true] => 'MARUICHI'
        }
      }, 
      scan_status[false] => {
        param[true] => {
          scan[true] => 'SCAN_DNS_QUERY'
        }, 
        value[true] => {
          value[true] => 'completed'
        }
      }, 
      scan_status[false] => {
        param[true] => {
          scan[true] => 'SCAN_NBNS_QUERY'
        }, 
        value[true] => {
          value[true] => 'completed'
        }
      }, 
      scan_status[false] => {
        param[true] => {
          scan[true] => 'SCAN_ARP_QUERY'
        }, 
        value[true] => {
          value[true] => 'completed'
        }
      },
      scan_status[false] => {
        param[true] => {
          scan[true] => 'SCAN_TCP_PORTS'
        },
        value[true] => {
          value[true] => 'completed'
        }
      }, 
      scan_status[false] => {
        param[true] => {
          scan[true] => 'SCAN_UDP_PORTS'
        }, 
        value[true] => {
          value[true] => 'completed'
        }
      }, 
      scan_status[false] => {
        param[true] => {
          scan[true] => 'SCAN_SNMP'
        }, 
        value[true] => {
          value[true] => 'completed'
        }
      }, 
      subnet[false] => {
        value[true] => {
          value[true] => '10.10.0.0/16'
        }
      }, 
      tcp_port[false] => {
        param[true] => {
          port[true] => '135'
        }, 
        value[true] => {
          value[true] => 'open'
        }
      }, 
      tcp_port[false] => {
        param[true] => {
          port[true] => '139'
        }, 
        value[true] => {
          value[true] => 'open'
        }
      }, 
      tcp_port[false] => {
        param[true] => {
          port[true] => '7'
        },
          value[true] => {
            value[true] => 'open'
          }
      }
    }
  }
 }
}

This statement would produces results like this:

row: 
 *: 
  network: 
   device: 
    interface: 
     mac_address: 00:0C:29:93:CE:EE
     inet: 
      ip_address: 10.10.20.27
    system: 
     dns_name: maruichi
    evidence: 
     mac_address: 
      value: 
       value: 00:0C:29:93:CE:EE
     ip_address: 
      value: 
       value: 10.10.20.27
     tcp_port: 
      value: 
       value: open
      param: 
       port: 7
     tcp_port: 
      value: 
       value: open
      param: 
       port: 139
     tcp_port: 
      value: 
       value: open
      param: 
       port: 135
     subnet: 
      value: 
       value: 10.10.0.0/16
     scan_status: 
      value: 
       value: completed
      param: 
       scan: SCAN_SNMP
     scan_status: 
      value: 
       value: completed
      param: 
       scan: SCAN_UDP_PORTS
     scan_status: 
      value: 
       value: completed
      param: 
       scan: SCAN_TCP_PORTS
     scan_status: 
      value: 
       value: completed
      param: 
       scan: SCAN_ARP_QUERY
     scan_status: 
      value: 
       value: completed
      param: 
       scan: SCAN_NBNS_QUERY
     scan_status: 
      value: 
       value: completed
      param: 
       scan: SCAN_DNS_QUERY
     probe: 
      value: 
       response: MARUICHI
      param: 
       request: 1.3.6.1.2.1.1.5.0
      param: 
       request_label: snmp_sysname
      param: 
       port: 161
      param: 
       technique: snmp
     probe: 
      value: 
       response: Hardware: x86 Family 6 Model 15 
                 Stepping 8 AT/AT COMPATIBLE - 
                 Software: Windows 2000 Version 5.0 
                 (Build 2195 Uniprocessor Free)
      param: 
       request: 1.3.6.1.2.1.1.1.0
      param: 
       request_label: snmp_sysdescr
      param: 
       port: 161
      param: 
       technique: snmp
     dns_name: 
      value: 
       value: maruichi
     dhcp: 
      value: 
       param_request_list: 1,15,3,6,44,46,47,31,33,43,252
      value: 
       host_name: maruichi
      value: 
       vendor_class: MSFT 5.0

Example 10: Use Node IDs in MERGE statements

You can use Node IDs in MERGE statements, in place of lengthy condition statements in the FROM clause. Many typical MERGE statements include a lengthy path that identifies which node or set of nodes to merge the new data into, such as: MERGE INTO / . . . where the slash / represents a path that roots the first clause of the VALUES tree. Node IDs cut that path down to just a number that uniquely identifies a single node.

For example, say you have 3 alerts in your database and you want to update the first one. To merge data into Alert 1, you must write a MERGE statement that identifies Alert 1, but filters out Alerts 2 and 3. To do this without using Node ID, you'd write a MERGE statement with some conditions that identify Alert 1, like this:

MERGE INTO / VALUES {
  apps[true] => {
    com[true] => {
      alerts[true] =>
      {
      alert[name='Alert 1'] => 
        {name[true] => 'Renamed Alert 1' 
        }
      }
    }
  }
}

Instead, you can use the Node ID of Alert 1 to identify it. To find the Node ID of any particular node, run a query like this:

Query: What is the Node ID of Alert 1?

SELECT /#id 
FROM /apps/com/paglo/alerts/alert 
WHERE name = 'Alert 1'

Result: The result indicates that 10357496 is the Node ID of Alert 1.

row
   id(): 10357496

Now you can update Alert 1 with a streamlined MERGE statement like this:

MERGE INTO 10357496 VALUES {
  name[true] => 'Renamed Alert 1' 
}

For more examples, see Examples and More examples.

Best practices and troubleshooting

WARNING! Be careful with the MERGE statement. With only a simple but all-encompasing statement, it is possible to delete your entire database with a statement this simple: MERGE INTO / VALUES {};

Identifying condition fields

Problems can occur if you use a MERGE statement in which you use a field in a condition to select a node, in a case where the field does not actually exist. For example, look at the following MERGE statement:

MERGE into / values {
  network[true] => {
    device[interface/mac_address = '00:A0:B0:C0:D0:E0' 
    OR interface/inet/ip_address = '10.10.10.10'] => {
      last_seen[true] => '2008-02-19T18:42:10.0000000Z'
    }
  }
}

If you run this statement, it creates a device, but does not create the interface sub-tree. Therefore, if you run this same MERGE statement again, it creates a second device instead of merely updating the first device.

Merging data from different sources

When data about a device is gathered from different sources, they often differ in how they identify the device. One plugin identifies the device by its IP address, and another by its MAC address. As part of best practices, always include both addresses in your MERGE statements so that data from different sources merges properly. Write this condition in the same order: mac_address OR ip_address as in the following example:

MERGE into / values {
  network[true] => {
    device[interface/mac_address = '00:A0:B0:C0:D0:E0' 
        OR interface/inet/ip_address = '10.10.10.10'] => {
      last_seen[true] => '2008-02-19T18:42:10.0000000Z'
    }
  }
}

Bad merge example

It might be helpful to examine a poorly-written MERGE statement and see what's wrong with it. The following example attempts to identify an interface that has an IP address of 10.20.30.40 or a MAX address of 00:A0:B0:C0:D0:E0, and instead creates problems in a couple of ways:

Incorrect MERGE statement:

MERGE INTO / values {
  network[true] => {
    device[interface[inet/ip_address = '10.20.30.40' 
    or mac_address = '00:A0:B0:C0:D0:E0'] => {
      mac_address[true] => '00:A0:B0:C0:D0:E0',
      inet => {
        ip_address[true] => '10.20.30.40'
      }
    }
  }
}

One problem is that the same clause that provides the conditions to match, also adds the sub-tree. That means that if GoToAssist doesn't find this sub-tree in the database already, it creates it new.

In other words, this MERGE statement adds an IP address of 10.20.30.40 and a MAC address of 00:A0:B0:C0:D0:E0 if it doesn't find them already in existence. Was that your intention? If not, you have added data accidently.

Another problem occurs if you run the same MERGE statement again at a later date, to add or update the database. Since the sub-tree now already exists, the MERGE statement creates a second inet clause and results in an interface with two identical IP addresses:

Result:

. . .
    interface/
       mac_address = '00:A0:B0:C0:D0:E0'
       inet/
         ip_address = '10.20.30.40'
       inet/
         ip_address = '10.20.30.40'
. . .

The reason this happens is an omission in the following lines of the original MERGE statement:

       inet => {
         ip_address[true] => '10.20.30.40'
       }

hese lines do not specify which inet tree to update, so the MERGE statement creates a second inet sub-tree.

To correct this omission, apply [true] to the inet clause:

       inet[true] => {
         ip_address[true] => '10.20.30.40'
       }

This would match any inet sub-tree under interface. However, what about an interface that has more than one IP address? The above inet[true] clause matches any and all inet sub-trees under interface and then sets the IP address for all of them to 10.20.30.40. For example, if you start with this interface in your database:

Database:

     ../interface/
       inet/
         ip_address = '10.20.30.40'
       inet/
         ip_address = '10.10.10.1'

After the MERGE statement you would have this interface in your database:

Result:

     ../interface/
       inet/
         ip_address = '10.20.30.40'
       inet/
         ip_address = '10.20.30.40'

The correct way to write this merge statement is to fully specify the condition clause for every non-leaf node, as follows:

Correct MERGE statement:

MERGE INTO / values {
  network[true] => {
    device[interface[inet/ip_address = '10.20.30.40' 
    or mac_address = '00:A0:B0:C0:D0:E0'] => {
      mac_address[true] => '00:A0:B0:C0:D0:E0',
        inet[ip_address = '10.20.30.40'] => {
         ip_address[true] => '10.20.30.40'
      }
    }
  }
}

Conclusion: Remember that when you have a sub-tree that may or may not exist, or when you may have multiple sub-trees with the same node name, construct your condition clauses so that you full identify every non-leaf node, and thereby update the correct sub-tree.

Related topics

Did this article answer your question?
Yes
No
Why?