[5 steps] TensorFlow Basic Tutorial

This tutorial is for beginners who want to understand how TensorFlow's high-level API can be used to make predictions.

Basic Blocks of TensorFlow

In general, there are 5 steps to making predictions from any given data:

1) Create Input Training Data & its training input function
2) Create Input Test Data & its testing input function
3) Choose feature columns and existing Classifier or custom Classifier
4) Train and Test the Model using Classifier
5) Perform Prediction using the model

Create Input Training Data & its training input function

We need good-quality data for the machine learning model to predict accurately. In general, the more training data we have, the better the model's predictions will be.

In this example we will use the widely used Iris training data, which comes as a simple CSV file with 4 features and the species as the fifth column.
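The snippets in this tutorial refer to a few names that are never defined in the text. A minimal sketch of what they might look like follows. The column names are taken from the prediction step later in the tutorial, while DATAPATH, batch_size, and train_steps are placeholder values assumed for illustration, not the author's settings.

```python
# Hypothetical setup: none of these values are given in the tutorial itself.
DATAPATH = 'iris_training.csv'  # assumed local path to the Iris CSV

# Column names as used by predict_x later on; 'Species' is the label column.
FEATURE_COLUMNS = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
ALL_CSV_COLUMNS = FEATURE_COLUMNS + ['Species']

# Class names indexed by class id, matching the `expected` list used at prediction time.
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

# Assumed hyperparameters for training and evaluation.
batch_size = 100
train_steps = 1000
```

This sketch also assumes the Species column stores integer class ids (0, 1, 2), which is what DNNClassifier expects when no label vocabulary is supplied.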

The code below reads the CSV data into a pandas DataFrame and splits it into two separate sets, one used for training and one for testing.

import pandas as pd
import sklearn.model_selection as sk

inputdata = pd.read_csv(DATAPATH, names=ALL_CSV_COLUMNS, header=0)

train_x, test_x, train_y, test_y = sk.train_test_split(inputdata.loc[:, FEATURE_COLUMNS], inputdata.loc[:, 'Species'], test_size=0.10)
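To make the split concrete, here is a plain-Python sketch of the idea: shuffle the row indices, then set aside a fraction for testing. It is an illustration only, not how scikit-learn implements train_test_split. The function name holdout_split, the seed, and the 20 toy rows are all hypothetical.

```python
import random

def holdout_split(rows, test_fraction=0.10, seed=42):
    """Shuffle the rows and return (train_rows, test_rows)."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

# Toy data: 20 rows, so a 10% hold-out sets aside 2 rows for testing.
train_rows, test_rows = holdout_split(list(range(20)))
print(len(train_rows), len(test_rows))
```

Every row ends up in exactly one of the two sets, so the model is later evaluated on rows it has not seen during training.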

An input function is then used to feed the data to the TensorFlow classifier in batches.

import tensorflow as tf

def train_input_fn(features, labels, batch_size):
    """Build a shuffled, endlessly repeating, batched dataset for training."""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)

    # Return the dataset.
    return dataset
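The shuffle, repeat, and batch chain can be hard to picture. The following plain-Python generator is a rough sketch of the same behaviour, assuming the shuffle buffer is at least as large as the dataset so each pass is fully reshuffled. It is not how tf.data is implemented, and the name shuffled_repeating_batches is hypothetical.

```python
import itertools
import random

def shuffled_repeating_batches(rows, batch_size, seed=0):
    """Yield batches forever, reshuffling the rows on every pass."""
    rng = random.Random(seed)

    def endless():
        while True:
            epoch = list(rows)
            rng.shuffle(epoch)   # new order on every pass over the data
            yield from epoch     # repeat: start over once the rows run out

    stream = endless()
    while True:
        # batch: group consecutive examples, even across pass boundaries
        yield list(itertools.islice(stream, batch_size))

# Take the first three batches of size 2 from five toy rows.
batches = list(itertools.islice(shuffled_repeating_batches([1, 2, 3, 4, 5], 2), 3))
print(batches)
```

Because the stream never ends, training has to be stopped by something else, which is why the train call later in the tutorial passes a steps argument.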

Create Input Test Data & its testing input function

Using the held-out test split, we write a similar input function for evaluation. It accepts both labeled and unlabeled data: passing labels=None yields features only, which is what the prediction step needs.

def eval_input_fn(features, labels, batch_size):
    """Build a batched dataset for evaluation (with labels) or prediction (labels=None)."""
    features = dict(features)
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)

    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)

    # Batch the examples
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)

    # Return the dataset.
    return dataset
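Both input functions pass a dictionary of columns to from_tensor_slices, which turns it into one example per row. A plain-Python sketch of that slicing, for illustration only (the helper name slice_columns and the toy values are hypothetical):

```python
def slice_columns(features, labels=None):
    """Turn a dict of equal-length columns into one dict per row."""
    names = list(features)
    n_rows = len(features[names[0]])
    for i in range(n_rows):
        row = {name: features[name][i] for name in names}
        # With labels, each example is a (features, label) pair.
        yield row if labels is None else (row, labels[i])

columns = {'SepalLength': [5.1, 5.9], 'SepalWidth': [3.3, 3.0]}
unlabeled = list(slice_columns(columns))
labeled = list(slice_columns(columns, labels=[0, 1]))
print(unlabeled)
print(labeled)
```

The unlabeled form is the shape used for prediction, and the labeled form is the shape used for training and evaluation.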

Choose feature columns and existing Classifier or custom Classifier

Next, we declare the feature columns the classifier will use. Every Iris feature is a real-valued number, so each one becomes a numeric column.

my_feature_columns = []

for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))

Choose a classifier that suits the data to get the best results. Here we use DNNClassifier.

# Build a DNN classifier with two hidden layers of 10 units each.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],
    n_classes=3)
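To get a feel for how small this network is, the sketch below counts its trainable parameters. It assumes plain fully connected layers with one bias per unit and one output per class, fed by the 4 Iris features. The helper dense_parameter_count is hypothetical and not part of TensorFlow.

```python
def dense_parameter_count(n_inputs, hidden_units, n_classes):
    """Count weights and biases in a fully connected network."""
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total

# (4*10 + 10) + (10*10 + 10) + (10*3 + 3) = 50 + 110 + 33
n_params = dense_parameter_count(4, [10, 10], 3)
print(n_params)
```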

Train and Test the Model using Classifier

Now let's train and test the classifier using the DNNClassifier and the input functions defined above.

# Train the model.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size),
    steps=train_steps)

# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, batch_size))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
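The accuracy reported here is the fraction of test examples whose predicted class matches the true class. A minimal plain-Python sketch of that calculation, using made-up class ids rather than real model output (the accuracy helper is hypothetical):

```python
def accuracy(predicted_ids, true_ids):
    """Fraction of predictions that match the true class ids."""
    if len(predicted_ids) != len(true_ids):
        raise ValueError('predicted_ids and true_ids must have the same length')
    correct = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return correct / len(true_ids)

# Illustrative values only: 3 of the 4 predictions match.
eval_result = {'accuracy': accuracy([0, 1, 2, 1], [0, 1, 1, 1])}
report = 'Test set accuracy: {accuracy:0.3f}'.format(**eval_result)
print(report)
```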

Perform Prediction using the model

In the final step, the model is ready to make predictions. We use a small set of inputs whose species we already know to check the model's output.

# Generate predictions from the model
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}

predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(predict_x, labels=None, batch_size=batch_size))

The predictions are returned as a generator, which we can iterate over to see the results.

for pred_dict, expec in zip(predictions, expected):
    template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"')

    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    print(template.format(SPECIES[class_id], 100 * probability, expec))
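To show how each prediction dictionary is read, the sketch below applies the same lookup to a hand-written stand-in. The probabilities are invented for illustration and are not real model output. In an actual run these fields would typically be NumPy arrays rather than lists, and the describe helper is hypothetical.

```python
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

# Hand-written stand-in for one element of `predictions`; not real model output.
pred_dict = {'class_ids': [1], 'probabilities': [0.1, 0.7, 0.2]}

def describe(pred_dict, expected_name):
    """Format one prediction the same way the tutorial's loop does."""
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    return 'Prediction is "{}" ({:.1f}%), expected "{}"'.format(
        SPECIES[class_id], 100 * probability, expected_name)

message = describe(pred_dict, 'Versicolor')
print(message)
```

The class id indexes both the probabilities and the SPECIES list, so the printed percentage is the model's confidence in the class it chose.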

I hope this walkthrough explained the TensorFlow high-level API in a simple, step-by-step way. Please feel free to contact support@clofus.com with any questions. Thanks!

Karthik Balu

CEO/Clofus innovations