Recently, I was converting a Torch model to an MXNet model (the project will be released here) so that it is easier to use on Android. The input of the model is a 128*128 image whose central quarter is blank (0,0,0), and the output is 64*64. However, when I tested the two models with the same image, the results did not match.
After debugging, I found that the input matrices fed to the two models were not the same. In Torch, the image is loaded with image.load from the image package, while in Python it is loaded with cv2.imread from the opencv package:
input_torch = image.load('test.png')
input_cv2 = cv2.imread('test.png')
After reading the source code of the functions above, I found three important differences between them.
- The channel order. image.load loads the image in RGB; cv2.imread reads the image in BGR.
- Data format. image.load organizes pixels as nc*h*w (channels, height, width) with values in the range 0-1; cv2.imread organizes pixels as h*w*nc with values in the range 0-255.
- Index. For image.load, the image matrix is indexed from 1; for cv2.imread, it is indexed from 0. This is more a difference between Lua and Python than between image.load and cv2.imread.
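To make the first two differences concrete, here is a minimal sketch of how an array in the cv2.imread layout could be brought into the image.load layout. It assumes a 3-channel, 8-bit image; the function name cv2_to_torch_layout is made up for this example, and a small synthetic array stands in for the real test.png so the snippet runs with NumPy alone.

```python
import numpy as np

def cv2_to_torch_layout(img_bgr):
    """Convert an h*w*nc BGR uint8 array (cv2.imread style) into
    an nc*h*w RGB float array in [0, 1] (image.load style).

    Hypothetical helper; assumes exactly 3 channels and 8-bit data.
    """
    img_rgb = img_bgr[:, :, ::-1]                 # reverse channel axis: BGR -> RGB
    img_chw = np.transpose(img_rgb, (2, 0, 1))    # h*w*nc -> nc*h*w
    return img_chw.astype(np.float32) / 255.0     # 0-255 -> 0-1

# Synthetic 2x3 "image" standing in for cv2.imread('test.png')
fake_bgr = np.zeros((2, 3, 3), dtype=np.uint8)
fake_bgr[:, :, 0] = 255                           # pure blue in BGR order

converted = cv2_to_torch_layout(fake_bgr)
print(converted.shape)                            # (3, 2, 3)
```

After the conversion, the blue values sit in the last channel (index 2 in Python), as they would in an RGB image. The third difference, 1-based versus 0-based indexing, only matters when you write indices by hand.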
There is also another minor problem you might run into. Often we want to set the pixels of an area to a constant value, such as k. Compared with C/C++, Python and Lua provide convenient ways to do this.
For lua,
input_torch[{{1},{37,92},{37,92}}] = 0
For python,
input_cv2[36:92,36:92,0] = 0
The two statements above cover the same rows and columns. Lua indexes from 1 and Python from 0, and a Lua range includes both its head (37) and its tail (92), while a Python slice includes its head (36) but stops just before its tail (92), that is, at index 91.
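The index conversion can be written down as a small rule: a 1-based inclusive range {first, last} becomes the 0-based half-open slice first-1:last. The sketch below checks this for the 37..92 range; lua_range_to_slice is a hypothetical helper name, and an array of ones stands in for input_cv2.

```python
import numpy as np

def lua_range_to_slice(first, last):
    """Map a 1-based inclusive range {first, last} (Lua style)
    to a 0-based half-open slice (Python style)."""
    return slice(first - 1, last)

region = lua_range_to_slice(37, 92)            # same as 36:92

img = np.ones((128, 128, 3), dtype=np.uint8)   # stand-in for input_cv2
img[region, region, 0] = 0                     # blank the region in the first channel

rows = region.stop - region.start
print(region.start, region.stop, rows)         # 36 92 56
```

Both forms cover 56 rows and 56 columns: 92 - 37 + 1 in Lua and 92 - 36 in Python.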
That is all I remember from this project. Please do not hesitate to contact me if you have any problems or find something wrong in this post.