Multi-view object instance recognition in an industrial context

Published online by Cambridge University Press:  23 June 2015

Wail Mustafa*
Affiliation:
Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark. Email: wail@mmmi.sdu.dk
Nicolas Pugeault
Affiliation:
Centre for Vision, Speech and Signal Processing, Faculty of Engineering & Physical Sciences, University of Surrey, Guildford GU2 7XH, UK
Anders G. Buch
Affiliation:
Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark
Norbert Krüger
Affiliation:
Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark
*Corresponding author. Email: wail@mmmi.sdu.dk

Summary

We present a fast object recognition system that combines shape, coded by viewpoint-invariant geometric relations, with appearance information. In our advanced industrial work-cell, the system observes the robot's workspace through three pairs of Kinect and stereo cameras, which provide reliable and complete object information. From these sensors, we derive global viewpoint-invariant shape features and robust color features that make use of color normalization techniques.

We show that in such a set-up, our system achieves high performance with only a very small number of training samples, which is crucial for user acceptance, and that the use of multiple views is crucial for performance. This indicates that our approach can be used in controlled but realistic industrial contexts that require, besides high reliability, fast processing and intuitive, easy use on the end-user side.

Type
Articles
Copyright
Copyright © Cambridge University Press 2015 
