Error Handling

A-Stack Runtime Error Messages

Below is the list of critical runtime error messages to look for in a deployed A-Stack environment.

Each entry below lists the Error Type, Message, Grep String, Description, and Resolution.

Error Type: ERROR
Message: "%s is used for %s more than %s msec", this, time, maxAllocTime
Grep String: " is used for "
Description: A TQL-to-SQL mapped query is taking more than 6000 milliseconds (the default maximum time) to return. Slow database responses severely degrade the performance of the runtime and may result in backed-up queues and client connection timeouts.
Resolution:
  1. Revisit the modeling.
  2. Check the database performance.

Error Type: ERROR
Message: %s save error[%s]: %s.%s.%s:%s=%s
Parameters: cn, exc.getSQLState(), aInstance, aAttribute, strOrder(aOrder)
Grep String: " save error"
Description: Error while saving a TQL model.
Resolution:
  1. Check the database connection.
  2. Check the parameter values of the Save operation and make sure the attribute types and values match. An example of a mismatch is passing a double value to an attribute of type String.

Error Type: ERROR
Message: %s update error[%s]: %s.%s.%s:%s=%s
Parameters: cn, exc.getSQLState(), aInstance, aAttribute, strOrder(aOrder)
Grep String: " update error"
Description: Error while updating a TQL model attribute.
Resolution:
  1. Check the database connection.
  2. Check the parameter values of the Update operation and make sure the attribute types and values match. An example of a mismatch is passing a double value to an attribute of type String.

Error Type: ERROR
Message: %s create error[%s]: %s.%s.%s
Parameters: cn, exc.getSQLState(), aInstance, aAttribute, strOrder(aOrder)
Grep String: " create error"
Description: Error while creating a TQL model attribute.
Resolution:
  1. Check the database connection.
  2. Check the parameter values of the Create operation and make sure the attribute types and values match. An example of a mismatch is passing a double value to an attribute of type String.

Error Type: ERROR
Message: %s delete value error[%s]: %s.%s.%s
Parameters: cn, exc.getSQLState(), aInstance, aAttribute, strOrder(aOrder)
Grep String: " delete value error"
Description: Error while deleting a TQL model attribute.
Resolution:
  1. Check the database connection.
  2. Check the parameter values of the Delete operation and make sure the attribute types and values match. An example of a mismatch is passing a double value to an attribute of type String.

Error Type: ERROR
Message: Network peer connect failed for:\n%s
Grep String: "Network peer connect"
Description: The peer connection has failed.
Resolution:
  1. Check the reported error and resolve it. Examples: the peer node is dead, or there is a network failure.
  2. Peers will keep trying to reconnect indefinitely.

Error Type: ERROR
Message: Peer message not sent:\n%s
Grep String: "Peer message not sent"
Description: The peer is not able to send a message to one of the connected peers in the cluster.
Resolution:
  1. Check the reported error and resolve it. Examples: the peer node is dead, or there is a network failure.

Error Type: ERROR
Message: error("Memory alarm %s: %.4f: channel(%s/%.4f) closed: %s", alarmCount, alarm, strFacetTitle(facet), importance, String.valueOf(chl));
Grep String: "Memory alarm "
Description: Based on the configured alarm value, the channel rejects incoming requests.
Resolution:
  1. This is the memory monitor feature. New requests are rejected once the alarm level is reached.

Error Type: ERROR
Message: "Response lost: %s\n'%s'", server, strMessage(msg, maxPrintLength)
Grep String: "Response lost"
Description: The client may have timed out, so the response was not sent to the client.
Resolution:
  1. This indicates a busy or loaded transport (HTTP), for example a burst of connections.

Error Type: ERROR
Message: HSQL DB reports a unique constraint violation exception, as shown below:
Caused by: org.hsqldb.HsqlException: error in script file line: 59 org.hsqldb.HsqlException: integrity constraint violation: unique constraint or index violation; SYS_PK_10098 table: TR
Grep String: "integrity constraint violation"
Description: This error indicates that the HSQL database failed to remove the record from the log file after committing it to the data file.
Resolution:
  1. This error happens if the database failed to shut down gracefully, i.e. the engine has crashed.
  2. To resolve it, delete the insert statements from the log file and restart the engine.
  3. If the log file contains many insert statements, then after the database comes back up, query the database, find the rows that do not exist in the log file, and re-insert them. Essentially, find the offending duplicate record and prevent it from being inserted.

Error Type: WARN
Message: "%s Forbidden: No facet ID\n'%s'\n", String.valueOf(chl), NettyUtils.strMessage(msg, maxPrintLength)
Grep String: "No facet ID"
Description: An external client is trying to access an endpoint that is not available.
Resolution:
  1. Make sure these messages do not arrive in large batches. That may be due to spamming, or to an application that has hard-wired the Facet ID. Facet IDs are meant to be dynamic in nature, with timeouts.

Error Type: WARN
Message: "ModifyPipeline: AutoClosed(%s): %s", timeout, pipeline
Grep String: "ModifyPipeline: AutoClosed"
Description: A protocol handler has auto-closed after a number of attempts.
Resolution:
  1. Check the destination that is closing the connection.

Error Type: WARN
Message: "%s: Maximum queue size (%d) reached: %d", getName(), defMaxQueueSize, size
Example: TestFacet: Maximum queue size (1000) reached: 1000
The maximum queue size is configured via the sff.max.queue.size parameter.

How to reproduce this scenario:

  1. Import the HelloTQL project.
  2. Create a Subscription Action on the TempValue attribute. In the action, call Thread.sleep(6000), so that every change in TempValue results in the action sleeping.

    SubscribeToHelloTQL
    #
    SubscribeToHelloTQL:
        TopicName: *Atomiton.Sensors.TempSensor.TempValue*
        TopicID: LongRunningAction
        ActionName:
            logInfo: "Calling SubscAction..."
  3. Run the InitSensor query.
  4. After a period of time, the "Maximum queue size reached" warning shows up in the log.
Grep String: " Maximum queue size "
Resolution:
  1. This warning means that the Subscription Actions cannot keep up with the rate of change of the attributes they subscribe to. In the HelloTQL reproduction above, TempValue changes every 2 seconds while each Subscription Action sleeps for 6 seconds (Thread.sleep(6000)).
  2. Revisit all the Subscription Actions and balance the action processing time against the rate of change of their respective attributes.
  3. A series of long-running Subscription Actions can increase the dynamic heap size, because internal queues are used to hold the pending actions. It can also slow down the engine overall, leading to other errors.

Error Type: WARN
Message: "Network peer %s at %s is not available", qname, p_key
Grep String: " Network peer "
Description: A cluster node is unable to connect to one of its peers.
Resolution:
  1. Check whether the peer is alive.

Error Type: WARN
Message: Connection refused:%s
Example: SffTcpClient:855 IO error in SffNetworkFacet:wsNotificationCluster-3130539813663882016:ws>NioClientSocketChannel[id: 0x156d1566]; Caused by: java.net.ConnectException: Connection refused: /199.199.199.169:8082
Grep String: " IO error "
Description: Connection exception while connecting to a peer.
Resolution:
  1. This is a network error while connecting to a remote peer.
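
The grep strings listed above can be checked automatically against a log. The following is a minimal sketch in Python, not part of A-Stack: the function name scan_log and the idea of feeding it log lines are assumptions made for illustration, and the patterns are copied from the list above (with the "save error" and "ModifyPipeline" strings written so that they match the message formats shown).

```python
# Grep strings from the list above, mapped to their error type.
# Order matters: the more specific "Network peer connect" is checked
# before the generic " Network peer " pattern.
GREP_STRINGS = {
    " is used for ": "ERROR",
    " save error": "ERROR",          # no trailing space, so "save error[" matches
    " update error": "ERROR",
    " create error": "ERROR",
    " delete value error": "ERROR",
    "Network peer connect": "ERROR",
    "Peer message not sent": "ERROR",
    "Memory alarm ": "ERROR",
    "Response lost": "ERROR",
    "integrity constraint violation": "ERROR",
    "No facet ID": "WARN",
    "ModifyPipeline: AutoClosed": "WARN",
    " Maximum queue size ": "WARN",
    " Network peer ": "WARN",
    " IO error ": "WARN",
}


def scan_log(lines):
    """Yield (line_number, error_type, grep_string, line) for each matching line.

    `lines` is any iterable of log lines, e.g. an open file object.
    Only the first matching grep string is reported per line.
    """
    for number, line in enumerate(lines, start=1):
        for grep_string, error_type in GREP_STRINGS.items():
            if grep_string in line:
                yield number, error_type, grep_string, line.rstrip("\n")
                break


if __name__ == "__main__":
    sample = [
        "INFO engine started",
        "WARN TestFacet: Maximum queue size (1000) reached: 1000",
        "ERROR Network peer connect failed for:",
    ]
    for hit in scan_log(sample):
        print(hit)
```

Passing an open log file instead of the sample list scans a real log. The mapping is kept as plain data so that entries can be added or removed without touching the loop.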

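To get a feel for how quickly the "Maximum queue size" warning described above can appear, the backlog can be estimated with a small simulation. This is a sketch under stated assumptions, not a description of the A-Stack scheduler: it assumes a single worker that runs Subscription Actions one at a time in arrival order, a fixed interval between attribute changes, and a fixed action duration. The function name seconds_until_queue_full is hypothetical. The values 2 seconds, 6 seconds (Thread.sleep(6000)), and 1000 come from the reproduction steps and the example message.

```python
from collections import deque


def seconds_until_queue_full(change_interval_s, action_duration_s, max_queue_size):
    """Return the simulated time (in seconds) at which the number of queued or
    running actions first reaches max_queue_size, or None if the action keeps up.

    Assumptions (not taken from A-Stack): one worker, FIFO order,
    constant change interval and constant action duration.
    """
    if action_duration_s <= change_interval_s:
        return None  # each action finishes before the next change arrives

    in_flight = deque()  # finish times of actions that are queued or running
    last_finish = 0      # time at which the worker finishes its latest action
    now = 0
    while True:
        # Remove the actions that have completed by the current time.
        while in_flight and in_flight[0] <= now:
            in_flight.popleft()
        # A new attribute change schedules one more action behind the others.
        start = max(now, last_finish)
        last_finish = start + action_duration_s
        in_flight.append(last_finish)
        if len(in_flight) >= max_queue_size:
            return now
        now += change_interval_s


if __name__ == "__main__":
    # TempValue changes every 2 s, each action sleeps 6 s, queue limit 1000.
    print(seconds_until_queue_full(2, 6, 1000))
```

Under these assumptions the backlog grows by roughly one action for every three attribute changes, so the limit is reached after a little under an hour. Shortening the action or slowing the rate of change until the duration no longer exceeds the interval removes the backlog entirely.
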
Error Handling in Applications

Error handling is a critical aspect of developing applications in any programming language. It is the mechanism by which applications notify end users of error conditions. Because A-Stack is a collection of atomic domain languages, error handling varies with the language the application uses. Error handling in A-Stack can be classified into three broad categories:

  • Error handling in NewFacetInstances
  • Error handling as part of process code in non-workflow definition languages
  • Error handling in Workflow Definition Language

Trapping Errors in NewFacetInstances

A-Stack provides an <OnError> event as part of the Facet life cycle, which can be used to trap errors.

Trapping Errors in Queries

All TQL exceptions are caught by the TQL facet itself and reported in the query/attribute status. TQL exceptions are not passed to the error handler, so check the status instead. The example below checks the Find status value and modifies the response accordingly.

The step to check the Find status is:

Error Handling in Queries
#
if($Response.Message.Value.Find/Status == 'Error'):
	#...
else:
	#...
Error Handling in Queries
#
Query:
  Find(Format: "Version"):
    VendorInfo1:
      vendorId(ne: "")
            
  if($Response.Message.Value.Find/Status == "Error"):
      AddResponseData:
        Key: Message.Value.Error.Message
        Value: [:$Response.Message.Value.Find.Error:]
      AddResponseData:
        Key: Message.Value.Error.Code
        Value: 1002
        
      SetResponseData:
        Key: Message.Value.Find
        Value: ""

The response will be as follows:

Customized Find Response
#
Error
    Message: "TQL Find failed: java.lang.IllegalArgumentException: {[3:6,8:19]} Target data model not found: VendorInfo1"
    Code: 1002
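
On the client side, a customized response like the one above can be checked for the Error element before the Find results are used. The following is a minimal sketch in Python. It assumes the client has already parsed the response into a nested dictionary with an "Error" entry holding "Message" and "Code"; that dictionary shape and the function name extract_query_error are hypothetical and not defined by A-Stack.

```python
def extract_query_error(response):
    """Return (code, message) if the parsed response carries an Error element,
    otherwise None.

    `response` is assumed to be a dict such as
    {"Error": {"Message": "...", "Code": 1002}} (hypothetical shape).
    """
    error = response.get("Error")
    if not error:
        return None
    return error.get("Code"), error.get("Message")


if __name__ == "__main__":
    parsed = {
        "Error": {
            "Message": "TQL Find failed: java.lang.IllegalArgumentException: "
                       "{[3:6,8:19]} Target data model not found: VendorInfo1",
            "Code": 1002,
        }
    }
    result = extract_query_error(parsed)
    if result is not None:
        code, message = result
        print(f"Query failed with code {code}: {message}")
```

Returning None for a response without an Error element lets the caller separate the error path from normal result handling with a single check.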