Image Analysis Using TensorFlow

By Nishank Biswas

TensorFlow is an open-source software library for large-scale numerical computation, organised around a data-flow graph. This design pays off in efficiency, particularly when work has to move between different environments. Numerical libraries for Python such as NumPy already perform expensive operations outside the Python environment, using efficient code written in other languages. However, switching back to Python after every operation has a cost, and that overhead can significantly hamper performance. To minimise it, TensorFlow has the user describe the whole computation as a graph of interacting operations, which then runs outside the Python environment.
The graph offers more than computational efficiency. It also gives the algorithm a structure, because the method is encoded in the network itself: nodes hold the mathematical operations, and edges carry the tensors involved in the computation. A tensor, the representation the library uses to pass data between operations, is a multi-dimensional array much like a NumPy array, and a placeholder is a node that reserves a slot for a tensor to be supplied later. With these pieces in mind, we can take a broader look at what actually happens in the graph.
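The idea of describing a computation as a graph first and running it later can be illustrated with a toy sketch in plain Python. This is not how TensorFlow is implemented; the Node class and the helper functions constant, add, multiply and run are hypothetical names invented for this example.

```python
# Toy illustration of "define the graph first, run it later".
# This is NOT TensorFlow code; Node, constant, add, multiply and run are
# hypothetical names used only for this sketch.

class Node:
    """A graph node: an operation plus the nodes whose outputs feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op            # name of the operation stored at the node
        self.inputs = inputs    # incoming edges (other nodes)
        self.value = value      # only used by constant nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", inputs=(a, b))


def multiply(a, b):
    return Node("mul", inputs=(a, b))


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.op == "const":
        return node.value
    left, right = (run(n) for n in node.inputs)
    if node.op == "add":
        return left + right
    if node.op == "mul":
        return left * right
    raise ValueError("unknown op: %s" % node.op)


# Building the graph performs no arithmetic...
x = constant(3.0)
y = constant(4.0)
z = add(multiply(x, y), y)   # describes (x * y) + y

# ...the computation only happens when the graph is run.
result = run(z)
print(result)
```

The values travelling between nodes play the role of the tensors on the edges, and run plays the role that a session plays in TensorFlow.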
A graph starts with the definition of its placeholders, which are tensors that hold the training, validation and test data and pass them on to the relevant nodes. Of the building blocks needed to assemble the graph, the first is the network that defines forward propagation and produces the output. It holds the defining parameters of the neural network used for training and prediction, namely the hidden-layer attributes, the activation functions, and the weights and biases. Training always centres on a cost function computed from the output together with any penalty terms, so the cost function is the next building block, followed by the training node. The training node adjusts the weights and biases against the training data so as to minimise the cost function, using an optimisation algorithm such as gradient descent or stochastic gradient descent.
Once the graph is defined, its operations are executed on the data through a session object. The session translates the graph definition into executable operations distributed across the available computing resources, and TensorFlow also lets you state explicitly which CPU or GPU should be used for an operation. After training, further nodes can be added to the graph to evaluate the model's performance on the unseen test data set.
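What the training node does can be sketched in plain Python with gradient descent on a simple cost function. This is only an illustration under stated assumptions: the model (a straight line y = w * x + b), the mean squared error cost, the data, the learning rate and the number of steps are all made up for the example and are not taken from the implementation later in this article.

```python
# Plain-Python sketch of a training step: gradient descent on a mean
# squared error cost for the model y = w * x + b. The data, learning rate
# and step count are made-up values for illustration.

def cost(w, b, xs, ys):
    """Mean squared error of the predictions w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, learning_rate):
    """Move w and b one step against the gradient of the cost."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]      # generated from y = 2x + 1

w, b = 0.0, 0.0
initial_cost = cost(w, b, xs, ys)
for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, learning_rate=0.05)
final_cost = cost(w, b, xs, ys)

print("w=%.3f b=%.3f cost=%.6f" % (w, b, final_cost))
```

Stochastic gradient descent follows the same update rule but estimates the gradient from a small batch of examples at each step rather than from the whole data set.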
This way of processing data also explains why the library suits so many domains, such as natural language processing, image classification and recognition, and multi-class classification with convolutional neural networks.

Image Analysis Using TensorFlow

Image recognition systems are designed to extract the key features from the image data sets passed to them and to respond appropriately when they see the same features again, which gives the system a form of artificial vision. Their performance can be raised to another level by combining a deep convolutional neural network for the image analysis with the framework TensorFlow provides. Walking through an implementation of image analysis is a good way to see both the adaptability of such a system and the versatility of TensorFlow's libraries.
The analysis begins by loading the image data sets into the Python environment. The graph definition follows, starting with the placeholders, which in this case store the pixel intensities of the images as multi-dimensional arrays, or tensors. Next comes the random initialisation of the network parameters, that is, the weights and biases of the successive layers of the neural network, which is needed to break symmetry and to avoid a situation in which the gradients are zero. Modern classifiers rely on deep learning, in the form of convolutional neural networks (CNNs), to reach far better performance than other approaches, and TensorFlow's predefined library functions make it possible to build such networks into the graph definition.
On many classification tasks, CNNs have the upper hand over the traditional multilayer perceptron (MLP). They are designed to emulate the behaviour of the human visual cortex and therefore concentrate on strong local spatial correlations, using connections arranged in three dimensions. Those connections are governed by a small set of controlling features, namely depth, stride, padding and pooling. Because each neuron connects only to a local region and the same filter weights are reused across the image, the network needs far fewer parameters than a fully connected one, which keeps its complexity in check and reduces the risk of overfitting. These architectures are then implemented for image analysis with the libraries TensorFlow provides, to build artificial computer vision.
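How stride and padding determine the size of a layer's output can be worked out with a little arithmetic. The sketch below follows the 'SAME' and 'VALID' padding rules documented for TensorFlow; the function name output_size and the example sizes are chosen for illustration only.

```python
import math

# Output size of a convolution or pooling layer along one spatial
# dimension, following TensorFlow's 'SAME' and 'VALID' padding rules.
# The function name and the example sizes are chosen for illustration.

def output_size(input_size, window, stride, padding):
    if padding == "SAME":
        # Input is zero-padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # No padding: the window must fit entirely inside the input.
        return math.ceil((input_size - window + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


size = 28                                           # e.g. a 28x28 input image
after_conv = output_size(size, 5, 1, "SAME")        # 5x5 filter, stride 1
after_pool = output_size(after_conv, 2, 2, "SAME")  # 2x2 pooling, stride 2
valid_conv = output_size(size, 5, 1, "VALID")       # same filter, no padding

print(after_conv, after_pool, valid_conv)
```

Depth does not enter this calculation: it is simply the number of filters in the layer and sets the size of the output's last dimension.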

Implementation in Python 

To implement image analysis with TensorFlow in Python, we will use the MNIST data set of handwritten digits. To keep the focus on optical character recognition with a convolutional neural network, we will use an already processed version of the data, with the pixel representation, format, size and labels prepared and the images split into training, validation and test sets. The code is written against the graph-and-session API of TensorFlow 1.x (tf.placeholder, tf.Session). We start by importing the required libraries and defining the key parameters.

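import numpy
from six.moves import xrange
import tensorflow as tf

IMAGE_SIZE = 28         # MNIST images are 28x28 pixels.
NUM_CHANNELS = 1        # Greyscale images have a single channel.
NUM_LABELS = 10         # One label per digit, 0 to 9.
VALIDATION_SIZE = 5000  # Size of the validation set.
SEED = 66478            # Set to None for random seed.
BATCH_SIZE = 64         # Assumed value; the listing does not state it.
EVAL_BATCH_SIZE = 64    # Assumed value; the listing does not state it.
EVAL_FREQUENCY = 100    # Number of steps between evaluations.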


We then define the metric used to measure the performance of the trained model on the test data set.

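def error_rate(predictions, labels):
    """Return the error rate, in percent, of the predictions against the labels."""
    # argmax picks the highest-scoring class in each row of predictions.
    return 100.0 - (
        100.0 * numpy.sum(numpy.argmax(predictions, 1) == labels) /
        predictions.shape[0])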
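As a worked example of this metric, the sketch below mirrors the calculation in plain Python so that it runs without NumPy. The function name error_rate_plain and the sample scores are hypothetical and serve only to show the arithmetic.

```python
# Plain-Python mirror of the error-rate metric, for illustration only.
# error_rate_plain and the sample scores are hypothetical.

def error_rate_plain(predictions, labels):
    """Percentage of rows whose highest-scoring class differs from the label."""
    correct = 0
    for scores, label in zip(predictions, labels):
        predicted = scores.index(max(scores))   # argmax over the class scores
        if predicted == label:
            correct += 1
    return 100.0 - 100.0 * correct / len(predictions)


predictions = [
    [0.1, 0.7, 0.2],   # predicts class 1
    [0.8, 0.1, 0.1],   # predicts class 0
    [0.3, 0.3, 0.4],   # predicts class 2
    [0.2, 0.5, 0.3],   # predicts class 1
]
labels = [1, 0, 1, 1]  # the third prediction is wrong

rate = error_rate_plain(predictions, labels)
print(rate)   # 25.0
```

Three of the four predictions match their labels, so the error rate is 100 - 75 = 25 percent.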



We can now start building the graph by creating the placeholders that will hold the image pixels and the variables that will hold the network parameters, namely the weights and biases.

def main(argv=None):
    # Placeholders that are fed a batch of data at every step. The data shape
    # follows the [image index, y, x, depth] layout expected by conv2d.
    train_data_node = tf.placeholder(
        tf.float32, shape=(BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS))
    train_labels_node = tf.placeholder(tf.int64, shape=(BATCH_SIZE,))
    eval_data = tf.placeholder(
        tf.float32, shape=(EVAL_BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS))

    # Trainable parameters, with randomly initialised weights.
    conv1_weights = tf.Variable(
        tf.truncated_normal([5, 5, NUM_CHANNELS, 32],  # 5x5 filter, depth 32.
                            stddev=0.1, seed=SEED))
    conv1_biases = tf.Variable(tf.zeros([32]))
    conv2_weights = tf.Variable(
        tf.truncated_normal([5, 5, 32, 64], stddev=0.1, seed=SEED))
    conv2_biases = tf.Variable(tf.constant(0.1, shape=[64]))
    fc1_weights = tf.Variable(
        tf.truncated_normal(
            [IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64, 512], stddev=0.1, seed=SEED))
    fc1_biases = tf.Variable(tf.constant(0.1, shape=[512]))
    fc2_weights = tf.Variable(
        tf.truncated_normal([512, NUM_LABELS], stddev=0.1, seed=SEED))
    fc2_biases = tf.Variable(tf.constant(0.1, shape=[NUM_LABELS]))
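To get a feel for the size of the network, the shapes of the variables defined above can be turned into parameter counts. The sketch below assumes the standard MNIST values IMAGE_SIZE = 28, NUM_CHANNELS = 1 and NUM_LABELS = 10; the dictionary and the count helper are hypothetical and exist only for this calculation.

```python
# Shapes and parameter counts of the variables defined above.
# Assumes the standard MNIST values for the three constants below.
IMAGE_SIZE = 28     # assumed: MNIST images are 28x28 pixels
NUM_CHANNELS = 1    # assumed: greyscale images
NUM_LABELS = 10     # assumed: digits 0 to 9

# Two 2x2 poolings with stride 2 halve each side twice: 28 -> 14 -> 7,
# so the flattened feature map has 7 * 7 * 64 entries.
flattened = IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64

shapes = {
    "conv1_weights": [5, 5, NUM_CHANNELS, 32],
    "conv1_biases": [32],
    "conv2_weights": [5, 5, 32, 64],
    "conv2_biases": [64],
    "fc1_weights": [flattened, 512],
    "fc1_biases": [512],
    "fc2_weights": [512, NUM_LABELS],
    "fc2_biases": [NUM_LABELS],
}


def count(shape):
    """Number of scalar values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


counts = {name: count(shape) for name, shape in shapes.items()}
total_parameters = sum(counts.values())

for name in shapes:
    print("%-14s %-18s %d" % (name, shapes[name], counts[name]))
print("total", total_parameters)
```

Under these assumptions almost all of the parameters sit in the first fully connected layer, while the two convolutional layers together account for only a small fraction.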


With that in place, we introduce the nodes of the model, using the typical CNN operations, so that it can learn from the data sets.

    def model(data, train=False):
        """The Model definition."""
        # 2D convolution, with 'SAME' padding (i.e. the output feature map has
        # the same size as the input). Note that {strides} is a 4D array whose
        # shape matches the data layout: [image index, y, x, depth].
        conv = tf.nn.conv2d(data, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu = tf.nn.relu(tf.nn.bias_add(conv, conv1_biases))
        # Here we have a pooling window of 2, and a stride of 2.
        pool = tf.nn.max_pool(relu, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        conv = tf.nn.conv2d(pool, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu = tf.nn.relu(tf.nn.bias_add(conv, conv2_biases))
        pool = tf.nn.max_pool(relu, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        # Reshape the feature map cuboid into a 2D matrix to feed it to the
        # fully connected layers.
        pool_shape = pool.get_shape().as_list()
        reshape = tf.reshape(
            pool, [pool_shape[0], pool_shape[1] * pool_shape[2] * pool_shape[3]])
        # Fully connected layer. Note that the '+' operation automatically
        # broadcasts the biases.
        hidden = tf.nn.relu(tf.matmul(reshape, fc1_weights) + fc1_biases)
        # Add a 50% dropout during training only. Dropout also scales
        # activations such that no rescaling is needed at evaluation time.
        if train:
            hidden = tf.nn.dropout(hidden, 0.5, seed=SEED)
        return tf.matmul(hidden, fc2_weights) + fc2_biases


    # Training computation: logits + cross-entropy loss.
    logits = model(train_data_node, True)
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=train_labels_node, logits=logits))
    # Step counter incremented once per update. It is added here because the
    # optimizer refers to it; treat its exact definition as an assumption.
    batch = tf.Variable(0, trainable=False)
    optimizer = tf.train.AdamOptimizer(1e-4).minimize(loss, global_step=batch)

    # Predictions for the current training minibatch.
    train_prediction = tf.nn.softmax(logits)
    # Predictions for the test and validation.
    eval_prediction = tf.nn.softmax(model(eval_data))
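The softmax and cross-entropy used in the training computation can be illustrated with a few lines of plain Python. This is a sketch of the underlying arithmetic for a single example, not the TensorFlow implementation; the function names and the sample scores are hypothetical.

```python
import math

# Plain-Python sketch of softmax and sparse cross-entropy for one example.
# The function names and the sample scores are hypothetical.

def softmax(scores):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def sparse_cross_entropy(scores, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(scores)[label])


sample_scores = [2.0, 1.0, 0.1]
probabilities = softmax(sample_scores)
loss_if_label_0 = sparse_cross_entropy(sample_scores, 0)
loss_if_label_2 = sparse_cross_entropy(sample_scores, 2)

print(probabilities)
print(loss_if_label_0, loss_if_label_2)
```

The loss is small when the true class receives a high probability and grows as that probability shrinks, which is what the optimizer exploits when it adjusts the weights.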


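    # Small utility function to evaluate a dataset by feeding batches of data to
    # {eval_data} and pulling the results from {eval_prediction}.
    # Saves memory and enables this to run on smaller GPUs.
    def eval_in_batches(data, sess):
        """Get all predictions for a dataset by running it in small batches."""
        size = data.shape[0]
        predictions = numpy.ndarray(shape=(size, NUM_LABELS), dtype=numpy.float32)
        for begin in xrange(0, size, EVAL_BATCH_SIZE):
            end = begin + EVAL_BATCH_SIZE
            if end <= size:
                predictions[begin:end, :] = sess.run(
                    eval_prediction,
                    feed_dict={eval_data: data[begin:end, ...]})
            else:
                # Last, partial batch: re-run the final EVAL_BATCH_SIZE rows and
                # keep only the predictions that have not been stored yet.
                batch_predictions = sess.run(
                    eval_prediction,
                    feed_dict={eval_data: data[-EVAL_BATCH_SIZE:, ...]})
                predictions[begin:, :] = batch_predictions[begin - size:, :]
        return predictions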
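The batching idea behind eval_in_batches can be tried out with plain lists. In the sketch below, fake_predict is a hypothetical stand-in for the session call and, like a fixed-size placeholder, only accepts batches of exactly EVAL_BATCH_SIZE items; the batch size and the data are made-up values.

```python
# Sketch of the batching trick used by eval_in_batches, with plain lists.
# fake_predict stands in for the session call and only accepts batches of
# exactly EVAL_BATCH_SIZE items. All names and sizes here are hypothetical.

EVAL_BATCH_SIZE = 4


def fake_predict(batch):
    assert len(batch) == EVAL_BATCH_SIZE
    return [item * 10 for item in batch]


def predict_in_batches(data):
    size = len(data)
    if size < EVAL_BATCH_SIZE:
        raise ValueError("dataset smaller than the evaluation batch size")
    predictions = [None] * size
    for begin in range(0, size, EVAL_BATCH_SIZE):
        end = begin + EVAL_BATCH_SIZE
        if end <= size:
            predictions[begin:end] = fake_predict(data[begin:end])
        else:
            # Last, incomplete batch: re-run the final EVAL_BATCH_SIZE items
            # and keep only the results that have not been stored yet.
            batch_predictions = fake_predict(data[-EVAL_BATCH_SIZE:])
            predictions[begin:] = batch_predictions[begin - size:]
    return predictions


data = list(range(10))   # 10 items: two full batches plus 2 left over
results = predict_in_batches(data)
print(results)
```

Re-running a few items from the previous batch costs a little extra computation, but it keeps every call at the fixed batch size.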


With the graph defined, all that remains is to trigger the execution of its nodes on the processing system through a session.

    # train_data, train_labels, test_data, test_labels, train_size and
    # num_epochs are assumed to have been prepared earlier (loading not shown).
    with tf.Session() as sess:
        # Run all the initializers to prepare the trainable parameters.
        sess.run(tf.global_variables_initializer())
        # Loop through training steps.
        for step in xrange(int(num_epochs * train_size) // BATCH_SIZE):
            # Offset of the current minibatch within the training data.
            offset = (step * BATCH_SIZE) % (train_size - BATCH_SIZE)
            batch_data = train_data[offset:(offset + BATCH_SIZE), ...]
            batch_labels = train_labels[offset:(offset + BATCH_SIZE)]
            feed_dict = {train_data_node: batch_data, train_labels_node: batch_labels}
            # Run the graph and fetch some of the nodes.
            _, l, predictions = sess.run(
                [optimizer, loss, train_prediction], feed_dict=feed_dict)
        # Finally print the result!
        test_error = error_rate(eval_in_batches(test_data, sess), test_labels)
        print('Test error: %.1f%%' % test_error)
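The way the training loop picks its minibatches is easiest to see with small numbers. The sketch below applies the same offset formula as the loop above, with a made-up batch size and training-set size chosen purely for illustration.

```python
# How the minibatch offset cycles through the training set.
# The sizes are small made-up values so the pattern is easy to see.
BATCH_SIZE = 4
train_size = 10

offsets = []
for step in range(6):
    offset = (step * BATCH_SIZE) % (train_size - BATCH_SIZE)
    offsets.append(offset)
    # The slice [offset, offset + BATCH_SIZE) always fits inside the data.
    assert offset + BATCH_SIZE <= train_size

print(offsets)   # [0, 4, 2, 0, 4, 2]
```

Taking the offset modulo train_size - BATCH_SIZE guarantees that every slice contains a full batch, which matters because the placeholders have a fixed batch dimension.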


Applications of TensorFlow

Some industrial applications of TensorFlow, in combination with convolutional neural networks, are described below.

Natural Language Processing (NLP)

It is a field focused on building an efficient interface between natural human language and the commands computers can execute. The backbone of the processing is converting natural language into something the system can understand. This is achieved by converting the text into a sparse matrix with the help of the TensorFlow library, which makes the data suitable for further processing by deep learning algorithms. The result is NLP software capable of automatic summarisation, machine translation, speech recognition, sentiment analysis, named entity recognition and more.
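The conversion of text into a sparse matrix can be sketched in plain Python with a simple bag-of-words count. This is one common way of doing it and is shown here as an assumption, not as the specific method the TensorFlow library uses; the sentences and the helper names are hypothetical.

```python
# Sketch of turning text into a sparse matrix (bag of words).
# The sentences and helper names are hypothetical; each row is stored as a
# {column index: count} dictionary so that zero entries take no space.

def build_vocabulary(sentences):
    """Assign a column index to every distinct word, in order of appearance."""
    vocabulary = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary


def to_sparse_rows(sentences, vocabulary):
    """Count the words of each sentence, keeping only non-zero entries."""
    rows = []
    for sentence in sentences:
        row = {}
        for word in sentence.lower().split():
            index = vocabulary[word]
            row[index] = row.get(index, 0) + 1
        rows.append(row)
    return rows


sentences = ["the cat sat", "the dog sat on the mat"]
vocabulary = build_vocabulary(sentences)
sparse_rows = to_sparse_rows(sentences, vocabulary)

print(vocabulary)
print(sparse_rows)
```

With a realistic vocabulary of many thousands of words, each sentence touches only a handful of columns, which is why a sparse representation is preferred over a dense one.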

Drug Discovery

This domain examines the interactions between drugs and their biological targets to estimate how likely a treatment is to be safe and effective. It relies on convolutional neural network architectures adapted to large-scale data sets that emphasise chemical features such as aromaticity, sp3 carbons and hydrogen bonding.

Email Classification and Smart Reply

This area of research focuses on classifying emails into user-defined subcategories on the basis of their body content. Deep neural networks are deployed for the task, and they also help to extract the crucial features, in this case the text content, from which a draft reply is generated to make the system as user-friendly as possible.

Image Analysis

The TensorFlow library provides a good number of modules to facilitate image processing: encoding and decoding images, resizing and cropping, flipping and transposing, adjusting properties such as brightness, contrast, hue and saturation, whitening, and converting between the RGB, greyscale and HSV colour spaces. These come in handy for image analysis and recognition tasks, some practical applications of which are as follows.

– Photo Optical Character Reader (OCR)

These algorithms rely on sliding-window detection over the pixels of an image to locate text, followed by character segmentation and, finally, classification against an existing character data set using a supervised machine learning algorithm. Many online retailers use such algorithms in their mobile apps to offer price comparison and product recommendations based on a captured image of a product.

– Element Classification

In real-time image analysis it is quite possible for an image to contain more than one element of interest. Element classification provides a way to pinpoint the elements that the machine learning algorithm has been trained to recognise, which makes it an instance of descriptive computer vision.

– Anomaly Detection / Pattern Recognition

These image recognition algorithms are mostly used in materials science, where detecting anomalies such as microporous cracks or spots is crucial to prevent significant changes in material properties, and in metallography, where it is worthwhile to ensure pattern consistency in the mineral of interest.

– Object Detection

Many autonomous ground vehicle driving algorithms use object detection as one of their crucial components. An extension of it is face detection, which narrows the detection to a specific boundary but increases the complexity, and is therefore used in areas such as crowd surveillance for security assessment.
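A few of the basic image operations listed under Image Analysis above, namely flipping, transposing, brightness adjustment and colour-space conversion, can be sketched in plain Python on a tiny image. The helpers below are hypothetical stand-ins written for illustration and are not TensorFlow functions; the greyscale weights are the common ITU-R BT.601 luma coefficients, which is an assumption about the conversion used.

```python
import colorsys

# Plain-Python sketch of a few image operations on a tiny RGB image stored
# as nested lists of (r, g, b) tuples with values in [0, 1]. These helpers
# are hypothetical stand-ins, not TensorFlow functions.

def flip_left_right(image):
    return [list(reversed(row)) for row in image]


def transpose(image):
    return [list(column) for column in zip(*image)]


def clamp(value):
    return min(1.0, max(0.0, value))


def adjust_brightness(image, delta):
    """Add delta to every channel, keeping values inside [0, 1]."""
    return [[tuple(clamp(c + delta) for c in pixel) for pixel in row]
            for row in image]


def to_greyscale(image):
    """Weighted sum of the channels (assumed BT.601 luma weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]


def to_hsv(image):
    return [[colorsys.rgb_to_hsv(*pixel) for pixel in row] for row in image]


image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],   # red,  green
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],   # blue, white
]

flipped = flip_left_right(image)
transposed = transpose(image)
brighter = adjust_brightness(image, 0.5)
grey = to_greyscale(image)
hsv = to_hsv(image)

print(flipped[0])
print(grey)
```

In a real pipeline these operations would run on full-size image tensors, typically to normalise the input or to augment the training data.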