Daily updates:

Present Enbridge implementation - https://mqidentity-my.sharepoint.com/:w:/g/personal/nkhadri_atomiton_com/ERo7AJWT0rNIv78t_72XPNwBBAANUTov0WtpikAcFRcvaQ?e=kOP3oE

 

10th May / 11th May

  1. All the reusable functions have been moved to a new file, library.py, in the DAS project.

  2. The input meter data CSV, the feature engineering config, and the output location are passed in the DAS config.

  3. DAS script: reads the arguments, calls the library functions, and saves the result to CSV. Support is still needed to save back to the DB from Python.

  4. The config is in ConfigObj format (a third-party alternative to the built-in configparser); the comment character is #.

  5. The output will have the FMV eps, duration, and mag columns for the measurements listed in the config.

7th May 2021

Config files:

  1. Present ini: uses Python's built-in configparser module, which has no nested sections or list values… I have written a lot of code to work around these limitations.

  2. ConfigObj (third party): very similar to the present ini, with minor syntax differences. It supports nested sections and lists, which makes the config file easy to generalize.

  3. JSON, XML: familiar to developers, but probably not to a layperson. This matters because the config could be edited by operations or by the customer (not now, but sometime later).

  4. Hydra (hydra.cc) from Facebook: takes YAML as input but has a lot of features, including logging, debugging, and changing config values from the command line. It is very powerful; the question is whether we even need this complexity.
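
To make the limitation in option 1 concrete, here is a minimal sketch using only the standard-library configparser. The section and key names (feature_engineering, measurements, min_duration) are made up for illustration and are not the actual DAS config. configparser hands back every value as a plain string, so a list has to be split by hand, and nesting can only be imitated with a naming convention such as dotted section names.

```python
import configparser

# Hypothetical config text; section and key names are illustrative only.
INI_TEXT = """
[feature_engineering]
# configparser sees this "list" as one comma-separated string
measurements = pressure, flow, temperature

[feature_engineering.episode]
min_duration = 3
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

raw = parser["feature_engineering"]["measurements"]
# Manual workaround: split and strip to recover the list.
measurements = [item.strip() for item in raw.split(",")]

# "Nesting" is only a naming convention: the dotted section is a flat sibling.
min_duration = parser.getint("feature_engineering.episode", "min_duration")

print(raw)
print(measurements)
print(min_duration)
```

This kind of splitting and naming workaround is what a format with native lists and nested sections would make unnecessary.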

ConfigObj example:

    keyword1 = value1
    keyword2 = value2

    [section 1]
    # with value as list
    keyword1 = value1, value2, value3
    keyword2 = value2

        # Sub section
        [[sub-section]]
        # this is in section 1
        keyword1 = value1
        keyword2 = value2

            [[[nested section]]]
            # this is in sub section
            keyword1 = value1
            keyword2 = value2

        [[sub-section2]]
        # this is in section 1 again
        keyword1 = value1
        keyword2 = value2

    [section 2]
    keyword1 = value1
    keyword2 = value2

This makes the values easy to access.

Generalizing command-line arguments: we can have a class and a function that abstract each argument's name, help text, and default value. The limitation is that the arguments would be passed back in a dictionary.

 

6th May 2021:

After defining the reusable functions, what else do DAS users need in order to implement feature engineering?

DAS user:

  1. Getting the config values: is “.ini” the right format? How do we standardize the config format for reading and parsing?

  2. Passing the appropriate values to the corresponding function

  3. Stitch the previous batch data with this batch

  4. Save intermediate data in CSV and png.

  5. Read CSV with index_name and date format

  6. A way to get the command-line arguments
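
Item 3 is the one that most affects function signatures: any rolling statistic at the start of a batch needs the tail of the previous batch. Below is a minimal sketch of one way to stitch batches, assuming one sample per minute and a 3-sample window. The function name rolling_mean_with_carryover and the plain-list data layout are hypothetical and are not the DAS API.

```python
def rolling_mean_with_carryover(prev_tail, batch, window=3):
    """Rolling mean over `batch`, seeded with the tail of the previous batch.

    Returns (means, new_tail): one mean per sample in `batch` (None while
    fewer than `window` samples are available) and the last window-1 samples,
    to be passed into the next call.
    """
    stitched = list(prev_tail) + list(batch)
    offset = len(prev_tail)  # index of the first sample of the current batch
    means = []
    for i in range(offset, len(stitched)):
        start = i - window + 1
        if start < 0:
            means.append(None)  # not enough history yet
        else:
            means.append(sum(stitched[start:i + 1]) / window)
    new_tail = stitched[-(window - 1):] if window > 1 else []
    return means, new_tail


# The first batch has no history, so its first two means are undefined.
means1, tail = rolling_mean_with_carryover([], [1.0, 2.0, 3.0, 4.0])
# The second batch reuses the carried-over tail, so every sample gets a mean.
means2, tail = rolling_mean_with_carryover(tail, [5.0, 6.0])
print(means1, means2, tail)
```

Returning the tail alongside the result keeps the caller responsible for persisting state between batches, which is one reason the function signature has to change once previous-batch data is involved.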

Solution:

  1. Configs can be in different formats; there need not be one standard.

  2. To carry forward the episode duration from the previous batch to the present batch, we need some code to get it into the required format.

  3. Since we need previous-batch data to take the 3-minute average, we need to change the API of SVC, DIV, and RVF.

  4. The intermediate files to be saved should be included in the API.

  5. read_file (fileName, index_col, isIndexColDate=False, removeDulplicateIndexWithKeep=None)

  6. Class:
    ArgumentClass(varName, help=None, defaultVal=None)
    Eg:
    arg1 = ArgumentClass("a")
    arg2 = ArgumentClass("--b", "help bb")
    arg3 = ArgumentClass("--c")

    Function:
    argumentDefinition(ArgumentClass objects....)
    Eg:
    argumentDefinition(arg1, arg2, arg3)

    Returns a dictionary whose keys are the variable names and whose values are the values passed on the command line.
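
The class and function in item 6 could be sketched on top of the standard-library argparse as follows. This is one possible shape, not the actual library.py code. The argv parameter is an added assumption so the function can be called with an explicit argument list; with argv=None it reads the real command line. Treating names without a leading dash as positional arguments is also an assumption.

```python
import argparse


class ArgumentClass:
    """Describes one command-line argument: name, help text, default value."""

    def __init__(self, varName, help=None, defaultVal=None):
        self.varName = varName
        self.help = help
        self.defaultVal = defaultVal


def argumentDefinition(*arguments, argv=None):
    """Register the ArgumentClass objects and return the parsed values as a dict.

    argv is an assumed extra parameter; None means "use sys.argv[1:]".
    """
    parser = argparse.ArgumentParser()
    for arg in arguments:
        if arg.varName.startswith("-"):
            # Optional flag such as --b
            parser.add_argument(arg.varName, help=arg.help, default=arg.defaultVal)
        elif arg.defaultVal is not None:
            # Positional with a default: make it optional on the command line
            parser.add_argument(arg.varName, help=arg.help, nargs="?",
                                default=arg.defaultVal)
        else:
            # Required positional
            parser.add_argument(arg.varName, help=arg.help)
    # vars() turns the argparse Namespace into a plain dictionary
    return vars(parser.parse_args(argv))


arg1 = ArgumentClass("a")
arg2 = ArgumentClass("--b", "help bb")
arg3 = ArgumentClass("--c", defaultVal="3")

values = argumentDefinition(arg1, arg2, arg3, argv=["input.csv", "--b", "x"])
print(values)
```

Because the result is a dictionary, callers look values up by name (values["b"]) rather than by attribute, which is the limitation noted in the 7th May entry.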