Load Mnist database of handwritten digit via Python
The handwritten digits recognition may be your first project when you get started with Tensorflow or Pytorch, and the dataset used in this project is the Mnist database of handwritten digit.
目录
What is Mnist?
Introduction
The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems.The database is also widely used for training and testing in the field of machine learning.It was created by “re-mixing” the samples from NIST’s original datasets. Source: Wikipedia.
Get Mnist
You can download the Mnist database from its website, http://yann.lecun.com/exdb/mnist/index.html.
There are four compressed files incluing training set images, training set labels, test set images and test set labels.
The format of Mnist
The data of Mnist is stored in a very simple file format. However, you can’t open them with common application program, and you need write your own program to read these files.
The above website has given the format of both image files and label files.
Oh, there is a little problem about thd data.
All the integers in the files are stored in the MSB first (high endian) format used by most non-Intel processors. Users of Intel processors and other low-endian machines must flip the bytes of the header.
It points out that the integers in the files is differen from the integers in Intel processor which used by most people’s computers.
Let’s use 0x00000102 as an example. We start the memory address from 0x0000.
In Intel processor, the 0x00000102 was stored like this.
address | value |
---|---|
0x0000 | 0x02 |
0x0001 | 0x01 |
0x0002 | 0x00 |
0x0003 | 0x00 |
In most non-Intel processor, it likes this.
address | value |
---|---|
0x0000 | 0x00 |
0x0001 | 0x00 |
0x0002 | 0x01 |
0x0003 | 0x02 |
Now, you have known the difference.
The format of images
1 | [offset] [type] [value] [description] |
The first integer is a magic number 0x803. I guess it was used to comfirm whether the file is parsed correctly. You can compare your first interget with 0x803 when you read the file using your own program.
The next three integers are number of images, the rows of image and the cols of image. Their meaning is obvious.
Other data is the pixel of images, ranging from 0 to 255.
The format of Labels
1 | [offset] [type] [value] [description] |
The first integer is also a magic number, but it is different from the image files. The next integer is number of labels.
Other data is the label of images, ranging from 0 to 9.
Load Mnist via Python
We will import a python package help us to produce the bytes data effectively.
It is struct
which is a build-in class of python. The struct
class is very easy, you can learn about it from Python API reference manual.
Load images
1 | import matplotlib.pyplot as plt |
First, we need to read four integers in the head of images file. Then we verify the magic number, there are some bugs if they are not corressonding.
After that we read some images accoding to a variant count
.
Now, you may understand the steps of struct
.
- use function
calcsize
get the size of bytes. - read some bytes from file.
- use function
unpack
get some variants according to designative format.
Finally, the variant images
is a long vector, it is necessary to reshape it use numpy before return it.
Load labels
1 | def load_label(filename, count=-1): |
The code of load_label
is same as load_image
. You are so smart that you will understand it.
Show images
1 | def show_image(images, count): |
We use matplotlib
to show loaded images.
The result:
