MMDLCA 2020

In conjunction with the 25th International Conference on Pattern Recognition (ICPR 2020)

The workshop was to be hosted at the Milan Congress Center (Mi.Co.), located at Piazzale Carlo Magno 1, Milan. It has now moved online; more information is available at the main conference website.

About the MMDLCA Workshop

Deep learning is now recognized as one of the key software engines driving the new industrial revolution. The majority of current deep learning research has been dedicated to single-modal data processing, with deep learning based visual recognition and speech recognition as pronounced examples. Although significant progress has been made, single-modal data is often insufficient to derive accurate and robust deep models in many applications. Our digital world is by nature multi-modal, combining different modalities of data such as text, audio, images, animations, videos and interactive content. Multi-modal content is the most popular form of information representation and delivery. For example, posts about hot social events are typically composed of textual descriptions, images and videos. For medical diagnosis, the joint use of medical imaging and textual reports is also essential. Humans commonly rely on multi-modal data to make accurate perceptions and decisions. Multi-modal deep learning, which is capable of learning from information presented in multiple modalities and consequently making predictions based on multi-modal input, is therefore much in demand.

This workshop calls for scientific works that illustrate the most recent progress in multi-modal deep learning. In particular, it covers multi-modal data capture, integration, modelling, understanding and analysis, and how to leverage them to derive accurate and robust AI models in many applications. It is a timely topic following the rapid development of deep learning technologies and their remarkable applications to many fields. The workshop will serve as a forum to bring together active researchers and practitioners to share their recent advances in this exciting area. We solicit original and high-quality contributions that: (1) present state-of-the-art theories and novel application scenarios related to multi-modal deep learning; (2) survey the recent progress in this area; or (3) develop benchmark datasets and evaluations. We welcome researchers from various communities (e.g., visual computing, machine learning, multimedia analysis, and distributed and cloud computing) to submit their novel results.

Authors of accepted papers will be encouraged to submit extended versions to a special issue of the Machine Vision and Applications journal, under the same theme.

Topics

Topics include, but are not limited to:

Multi-modal intelligent data acquisition and management
Multi-modal benchmark datasets and evaluations
Multi-modal representation learning and applications
Multi-modal data driven visual analysis and understanding
Multi-modal object detection, classification, recognition and segmentation
Multi-modal information tracking, retrieval and identification
Multi-modal social event analysis
Multi-modal medical diagnosis
Multi-modal machine learning from incomplete data
Deep neural network architectures for multi-modal data processing
Multi-modal big data analytics
Emerging multi-modal deep learning applications

Program Committee

Marco Bertini, Professor, University of Florence, Italy
Juan Cao, Professor, Chinese Academy of Sciences, China
Jingjing Chen, Associate Professor, Fudan University, China
Wen-Huang Cheng, Professor, National Chiao Tung University, Taiwan
Huazhu Fu, Senior Scientist, Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
Chuang Gan, Research Fellow, MIT, USA
Bogdan Ionescu, Professor, University Politehnica of Bucharest, Romania
Anan Liu, Professor, Tianjin University, China
Symeon (Akis) Papadopoulos, Senior Researcher, CERTH-ITI, Greece
Tiberio Uricchio, Research Fellow, University of Florence, Italy
Nikolaos V. Boulgouris, Senior Lecturer, Brunel University London, United Kingdom
Wei Zhang, Senior Research Scientist, JD AI Research, China

Accepted Papers

Paper ID	Paper Title
2	Hierarchical Consistency and Refinement for Semi-supervised Medical Segmentation
3	BVTNet: Multi-label Multi-class Fusion of Visible and Thermal Camera for Free Space and Pedestrian Segmentation
5	Multimodal Emotion Recognition Based on Speech and Physiological Signals Using Deep Neural Networks
6	Cross-modal Deep Learning Applications: Audio-Visual Retrieval
10	Exploiting Word Embeddings for Recognition of Unseen Objects
12	Automated segmentation of lateral ventricle in MR images using multi-scale feature fusion convolutional neural network
13	Visual Word Embedding for Text Classification
16	CC-LSTM: Cross and Conditional Long-Short Time Memory for Video Captioning
18	An Overview of Image-to-Image Translation using Generative Adversarial Networks
20	Fusion Models for Improved Visual Captioning
21	From Bottom to Top: A Coordinated Feature Representation Method for Speech Recognition

MMDLCA 2020 Program

PROGRAM SCHEDULE OF MMDLCA 2020
Monday, January 11, 2021 (CET)
12:00	Joining the online conference. Introduction to the technical information (for online participants)
12:00-14:20	Plenary Session	Chair
12:00-12:40	Keynote Talk 1: Multimodal Medical Data Analysis: Machine Learning in Histopathology. Henning Muller, Professor at the University of Geneva, Switzerland	Xirong Li, Renmin University of China, China
12:40-13:00	Hierarchical Consistency and Refinement for Semi-supervised Medical Segmentation. Zixiao Wang, Hai Xu, Youliang Tian and Hongtao Xie (University of Science and Technology of China, China; Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, China)
13:00-13:20	BVTNet: Multi-label Multi-class Fusion of Visible and Thermal Camera for Free Space and Pedestrian Segmentation. Vijay John, Ali Boyali, Simon Thompson and Seiichi Mita (Toyota Technological Institute, Japan; Tier IV, Japan)
13:20-13:40	Cross-modal Deep Learning Applications: Audio-Visual Retrieval. Cong Jin, Tian Zhang, Shouxun Liu, Yun Tie, Jianguang Li, Wencai Yan and Ming Yan (Communication University of China, China; Zhengzhou University, China)
13:40-14:00	Automated segmentation of lateral ventricle in MR images using multi-scale feature fusion convolutional neural network. Fei Ye, Zhiqiang Wang, Kai Hu, Sheng Zhu and Xieping Gao (Xiangtan University, China; Xiangnan University, China)
14:00-14:20	From Bottom to Top: A Coordinated Feature Representation Method for Speech Recognition. Lixia Zhou and Jun Zhang (Guangdong University of Technology, China)
14:20-16:40	Plenary Session		Chair
14:20-15:00	Keynote Talk 2: Vision to Language: from Independency, Interaction, to Symbiosis. Ting Yao, Principal Researcher at JD AI Research, China	Zhineng Chen, Institute of Automation, Chinese Academy of Sciences, China
15:00-15:20	An Overview of Image-to-Image Translation using Generative Adversarial Networks. Xin Chen and Caiyan Jia (Beijing Jiaotong University, China)
15:20-15:40	Multimodal Emotion Recognition Based on Speech and Physiological Signals Using Deep Neural Networks. Ali Bakhshi and Stephan Chalup (The University of Newcastle, Australia)
15:40-16:00	Exploiting Word Embeddings for Recognition of Unseen Objects. Karan Sharma, Hemanth Dandu, Arun Kumar, Vinay Boddula and Suchendra Bhandarkar (Keysight Technologies, United States; The University of Georgia, United States)
16:00-16:20	Visual Word Embedding for Text Classification. Ignazio Gallo, Shah Nawaz, Nicola Landro and Riccardo La Grassa (University of Insubria, Italy)
16:20-16:40	Fusion Models for Improved Visual Captioning. Marimuthu Kalimuthu, Aditya Mogadala, Marius Mosbach and Dietrich Klakow (Saarland Informatics Campus, Saarland University, Germany)
16:40	Closing Ceremony

Submission Guidelines

Submissions must be formatted in accordance with Springer's Computer Science Proceedings guidelines (https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines). The review process is single-blind. Two types of contribution will be considered:

Full papers (10-12 pages, including references, oral presentation)
Short papers (6-8 pages, including references, poster presentation)

Accepted manuscripts will be included in the ICPR 2020 Workshop Proceedings Springer volume. Once a paper is accepted, at least one author is expected to attend the event and present the paper orally (online).

We have set up a submission entry in EasyChair. It is OPEN now!

Important Dates

Workshop submission deadline: October 17th, 2020
Workshop author notification: November 10th, 2020
Camera-ready submission: November 15th, 2020
Finalized workshop program: December 1st, 2020

Workshop Organizers

zhineng.chen@ia.ac.cn: Dr. Zhineng Chen, Associate Professor, Chinese Academy of Sciences, China.
xirong@ruc.edu.cn: Dr. Xirong Li, Associate Professor, Renmin University of China, China.
e.gavves@uva.nl: Dr. Efstratios Gavves, Associate Professor, University of Amsterdam, Netherlands.
may4mc@gmail.com: Dr. Mei Chen, Principal Research Manager, Microsoft Cloud & AI, USA.
ikom@iti.gr: Dr. Ioannis (Yiannis) Kompatsiaris, Research Director, CERTH-ITI, Greece.

ICPR 2020