autoBOTLib library

The following minimal use case introduces the basic autoBOTLib functionality. The data used in the example is available at: https://github.com/SkBlaz/autobot/tree/master/data

Let's first inspect how a model is trained.

import autoBOTLib
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Load example data frame
dataframe = pd.read_csv("../data/insults/train.tsv", sep="\t")
train_sequences = dataframe['text_a'].values.tolist()
train_targets = dataframe['label'].values

autoBOTLibObj = autoBOTLib.GAlearner(
        train_sequences,  # input sequences
        train_targets,  # target space
        time_constraint=1,  # time in hours
        num_cpu="all",  # number of CPUs to use
        task_name="example test",  # task identifier
        scoring_metric = "f1", # sklearn-compatible scoring metric as the fitness.
        hof_size=3,  # size of the hall of fame
        top_k_importances=25,  # how many top features to output as final ranking
        memory_storage="./memory",  # knowledge graph (triplet) base used for the concept features (see the ./examples folder)
        representation_type="neurosymbolic")  # "symbolic", "neural" or "neurosymbolic"; the neurosymbolic option also includes the doc2graph transformation, which is in beta

autoBOTLibObj.evolve(
                nind=10,  ## population size
                crossover_proba=0.6,  ## crossover rate
                mutpb=0.4)  ## mutation rate
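Once evolution completes, the trained object can be used to label new documents. The following is a minimal sketch; it assumes a held-out split test.tsv with the same columns as the training file, and the predict call follows the usage shown in the repository's examples:

## Sketch: predict on a held-out split. Assumes ../data/insults/test.tsv
## exists with the same columns as train.tsv; predict() usage follows the
## repository's examples.
test_dataframe = pd.read_csv("../data/insults/test.tsv", sep="\t")
test_sequences = test_dataframe['text_a'].values.tolist()
predictions = autoBOTLibObj.predict(test_sequences)
print(predictions[:10])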

The autoBOTLibObj object now contains a trained model, explanations and other relevant information. Let’s explore its capabilities next.

We can first visualize the evolution’s trace:

## visualize fitnesses

autoBOTLibObj.visualize_fitness(image_path = "fitness.png")
(The resulting plot is written to fitness.png and shows the fitness trace of the evolution.)

As autoBOTLib is fully explainable, we can explore the two layers of explanations as follows:

## store global importances
importances_local, importances_global = autoBOTLibObj.feature_type_importances()
print(importances_global)

This results in the subspace feature importances (importances_global):

Importance            Feature subspace
0.4124583243111468    word_features
0.2811283792683306    char_features
0.27482709838903063   pos_features
1.0036820174140975    relational_features
0.5351954677290582    keyword_features
0.0                   concept_features
0.4983623274641806    neural_features_dm
0.2565542438450016    neural_features_dbow

and the subspace-level rankings (importances_local):

                keyword_features       char_features   word_features   pos_features    relational_features     concept_features        neural_features_dm      neural_features_dbow
0      moron : 2.76    ck : 1.06       fake : 1.26     prp vbp dt : 3.42       o--3--d : 3.31  antonym(act,nothing) : 0.0      13_1 : 1.41     183_0 : 0.55
1      idiot : 2.62    fuc : 0.8       pig : 1.14      vbp dt : 2.99   n--15--s : 2.96 antonym(act,real) : 0.0 323_1 : 1.41    321_0 : 0.54
2      loser : 2.04    uck : 0.79      go back : 0.87  nn : 2.56        --3--c : 2.96  antonym(around,far) : 0.0       217_1 : 1.37    126_0 : 0.53
3      fa**ot : 1.99   f*ck : 0.77     azz : 0.58      prp vbp : 2.06  r--2--p : 2.84  antonym(ask,tell) : 0.0 414_1 : 1.26    337_0 : 0.52
4      ignorant : 1.57 fu : 0.69       jerk : 0.44     vbp dt jj : 2.0 u--2--s : 2.77  antonym(away,back) : 0.0        259_1 : 1.21    223_0 : 0.51
5      b*tch : 1.56    pi : 0.68       liar : 0.44     vbp dt nn : 1.74        n--6--g : 2.75  antonym(away,come) : 0.0        311_1 : 1.21    72_0 : 0.5
6      stupid : 1.49   gg : 0.66       stfu : 0.44     prp : 1.48      e--14--f : 2.74 antonym(away,stay) : 0.0        89_1 : 1.13     271_0 : 0.47
7      mouth : 1.47    uc : 0.65       ass ni**a : 0.39        vbp : 1.47       --10--t : 2.72 antonym(away,stay) relatedto(away,far) : 0.0    91_1 : 1.12     335_0 : 0.45
8      retarded : 1.39 u  : 0.64       otr : 0.39      in : 1.44       c--4--g : 2.69  antonym(away,stay) relatedto(away,way) : 0.0    36_1 : 1.09     112_0 : 0.44
9      kidding : 1.21  dumb : 0.63     smug : 0.37     prp nn : 1.21   a--7--t : 2.68  antonym(bad,right) : 0.0        391_1 : 1.09    244_0 : 0.42
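Since matplotlib and seaborn are already imported above, the global importances can also be inspected visually. A minimal sketch, assuming importances_global is a pandas DataFrame with the two columns shown above:

## Sketch: bar plot of the subspace importances. Assumes importances_global
## is a pandas DataFrame with the columns "Importance" and "Feature subspace".
sns.barplot(x="Importance", y="Feature subspace", data=importances_global)
plt.tight_layout()
plt.savefig("subspace_importances.png", dpi=300)
plt.clf()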

Finally, to explore the properties of the individual classifiers in the final ensemble, you can obtain a summary table as follows:

final_learners = autoBOTLibObj.summarise_final_learners()
print(final_learners)
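Assuming the returned summary is a pandas DataFrame (as the printed output suggests), it can also be written to disk for later comparison:

## Sketch: store the ensemble summary (assumes a pandas DataFrame)
final_learners.to_csv("final_learners.tsv", sep="\t", index=False)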

Putting it all together, an automated report can be generated as follows:

autoBOTLibObj.generate_report("report_folder")
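The trained object itself can be stored and reloaded with standard Python pickling; a minimal sketch (the file name model.pickle is arbitrary):

import pickle

## Sketch: persist the evolved model object with standard pickling
with open("model.pickle", "wb") as output_file:
    pickle.dump(autoBOTLibObj, output_file)

## ... and reload it later
with open("model.pickle", "rb") as input_file:
    autoBOTLibObj = pickle.load(input_file)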

For more examples and use cases, please inspect the examples folder!