ICPR 2020

In conjunction with the 25th International Conference on Pattern Recognition (ICPR 2020)

The workshop was to be hosted at the Milan Congress Center (Mi.Co.), located at Piazzale Carlo Magno 1, Milan. It has now moved online; more information is available on the main conference website.

About the MMDLCA Workshop

Deep learning is now recognized as one of the key software engines driving the new industrial revolution. Most current deep learning research has been dedicated to single-modal data processing, with deep learning based visual recognition and speech recognition as prominent examples. Although significant progress has been made, single-modal data is often insufficient to derive accurate and robust deep models in many applications. Our digital world is by nature multi-modal: it combines different modalities of data such as text, audio, images, animations, videos and interactive content. Multi-modal content is the most popular form of information representation and delivery. For example, posts about trending social events are typically composed of textual descriptions, images and videos. In medical diagnosis, the joint use of medical imaging and textual reports is also essential. Humans routinely rely on multi-modal data to make accurate perceptions and decisions. Multi-modal deep learning, which is capable of learning from information presented in multiple modalities and consequently making predictions based on multi-modal input, is therefore much in demand.

This workshop calls for scientific works that illustrate the most recent progress on multi-modal deep learning, in particular on multi-modal data capture, integration, modelling, understanding and analysis, and on how to leverage them to derive accurate and robust AI models in many applications. It is a timely topic following the rapid development of deep learning technologies and their remarkable applications to many fields. The workshop will serve as a forum to bring together active researchers and practitioners to share their recent advances in this exciting area. In particular, we solicit original and high-quality contributions that: (1) present state-of-the-art theories and novel application scenarios related to multi-modal deep learning; (2) survey the recent progress in this area; or (3) develop benchmark datasets and evaluations. We welcome researchers from various communities (e.g., visual computing, machine learning, multimedia analysis, distributed and cloud computing) to submit their novel results.

Authors of accepted papers will be encouraged to submit extended versions to a special issue of the Machine Vision and Applications journal on the same theme.

Topics

The list of topics includes, but is not limited to:
  • Multi-modal intelligent data acquisition and management
  • Multi-modal benchmark datasets and evaluations
  • Multi-modal representation learning and applications
  • Multi-modal data driven visual analysis and understanding
  • Multi-modal object detection, classification, recognition and segmentation
  • Multi-modal information tracking, retrieval and identification
  • Multi-modal social event analysis
  • Multi-modal medical diagnosis
  • Multi-modal machine learning from incomplete data
  • Deep neural network architectures for multi-modal data processing
  • Multi-modal big data analytics
  • Emerging multi-modal deep learning applications

Program Committee

  • Marco Bertini, Professor, University of Florence, Italy
  • Juan Cao, Professor, Chinese Academy of Sciences, China
  • Jingjing Chen, Associate Professor, Fudan University, China
  • Wen-Huang Cheng, Professor, National Chiao Tung University, Taiwan
  • Huazhu Fu, Senior Scientist, Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
  • Chuang Gan, Research Fellow, MIT, USA
  • Bogdan Ionescu, Professor, University Politehnica of Bucharest, Romania
  • Anan Liu, Professor, Tianjin University, China
  • Symeon (Akis) Papadopoulos, Senior Researcher, CERTH-ITI, Greece
  • Tiberio Uricchio, Research Fellow, University of Florence, Italy
  • Nikolaos V. Boulgouris, Senior Lecturer, Brunel University London, United Kingdom
  • Wei Zhang, Senior Research Scientist, JD AI Research, China

Accepted Papers

Paper ID Paper Title
2 Hierarchical Consistency and Refinement for Semi-supervised Medical Segmentation
3 BVTNet: Multi-label Multi-class Fusion of Visible and Thermal Camera for Free Space and Pedestrian Segmentation
5 Multimodal Emotion Recognition Based on Speech and Physiological Signals Using Deep Neural Networks
6 Cross-modal Deep Learning Applications: Audio-Visual Retrieval
10 Exploiting Word Embeddings for Recognition of Unseen Objects
12 Automated segmentation of lateral ventricle in MR images using multi-scale feature fusion convolutional neural network
13 Visual Word Embedding for Text Classification
16 CC-LSTM: Cross and Conditional Long-Short Time Memory for Video Captioning
18 An Overview of Image-to-Image Translation using Generative Adversarial Networks
20 Fusion Models for Improved Visual Captioning
21 From Bottom to Top: A Coordinated Feature Representation Method for Speech Recognition

MMDLCA 2020 Program

PROGRAM SCHEDULE OF MMDLCA 2020

Monday, January 11, 2021 (CET)

12:00 Joining the online conference. Introduction to the technical information (for online participants)
12:00-14:40 Plenary Session (Chair: Xirong Li, Renmin University of China, China)
12:00-12:40 Keynote Talk 1
Multimodal Medical Data Analysis: Machine Learning in Histopathology
Henning Muller
Professor at the University of Geneva, Switzerland
12:40-13:00 Hierarchical Consistency and Refinement for Semi-supervised Medical Segmentation
Zixiao Wang, Hai Xu, Youliang Tian and Hongtao Xie
University of Science and Technology of China, China
Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, China
13:00-13:20 BVTNet: Multi-label Multi-class Fusion of Visible and Thermal Camera for Free Space and Pedestrian Segmentation
Vijay John, Ali Boyali, Simon Thompson and Seiichi Mita
Toyota Technological Institute, Japan
Tier IV, Japan
13:20-13:40 Cross-modal Deep Learning Applications: Audio-Visual Retrieval
Cong Jin, Tian Zhang, Shouxun Liu, Yun Tie, Jianguang Li, Wencai Yan and Ming Yn
Communication University of China, China
Zhengzhou University, China
13:40-14:00 Automated segmentation of lateral ventricle in MR images using multi-scale feature fusion convolutional neural network
Fei Ye, Zhiqiang Wang, Kai Hu, Sheng Zhu and Xieping Gao
Xiangtan University, China
Xiangnan University, China
14:00-14:20 From Bottom to Top: A Coordinated Feature Representation Method for Speech Recognition
Lixia Zhou and Jun Zhang
Guangdong University of Technology, China
14:20-16:40 Plenary Session (Chair: Zhineng Chen, Institute of Automation, Chinese Academy of Sciences, China)
14:20-15:00 Keynote Talk 2
Vision to Language: from Independency, Interaction, to Symbiosis
Ting Yao
Principal Researcher at JD AI Research, China
15:00-15:20 An Overview of Image-to-Image Translation using Generative Adversarial Networks
Xin Chen and Caiyan Jia
Beijing Jiaotong University, China
15:20-15:40 Multimodal Emotion Recognition Based on Speech and Physiological Signals Using Deep Neural Networks
Ali Bakhshi and Stephan Chalup
The University of Newcastle, Australia
15:40-16:00 Exploiting Word Embeddings for Recognition of Unseen Objects
Karan Sharma, Hemanth Dandu, Arun Kumar, Vinay Boddula and Suchendra Bhandarkar
Keysight Technologies, United States
The University of Georgia, United States
16:00-16:20 Visual Word Embedding for Text Classification
Ignazio Gallo, Shah Nawaz, Nicola Landro and Riccardo La Grassa
University of Insubria, Italy
16:20-16:40 Fusion Models for Improved Visual Captioning
Marimuthu Kalimuthu, Aditya Mogadala, Marius Mosbach and Dietrich Klakow
Saarland Informatics Campus, Saarland University, Germany
16:40 Closing Ceremony

Submission Guidelines

Submissions must be formatted in accordance with Springer's Computer Science Proceedings guidelines (https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines). The review process is single-blind. Two types of contribution will be considered:

  • Full papers (10-12 pages, including references, oral presentation)
  • Short papers (6-8 pages, including references, poster presentation)

Accepted manuscripts will be included in the ICPR 2020 Workshop Proceedings Springer volume. Once accepted, at least one author is expected to attend the event and orally present the paper (online).

We have set up a submission entry on EasyChair. It is OPEN now!

Important Dates

  • Workshop submission deadline: October 17th, 2020
  • Workshop author notification: November 10th, 2020
  • Camera-ready submission: November 15th, 2020
  • Finalized workshop program: December 1st, 2020

Workshop Organizers

zhineng.chen@ia.ac.cn
Dr. Zhineng Chen, Associate Professor, Chinese Academy of Sciences, China.
xirong@ruc.edu.cn
Dr. Xirong Li, Associate Professor, Renmin University of China, China.
e.gavves@uva.nl
Dr. Efstratios Gavves, Associate Professor, University of Amsterdam, Netherlands.
may4mc@gmail.com
Dr. Mei Chen, Principal Research Manager, Microsoft Cloud & AI, USA.
ikom@iti.gr
Dr. Ioannis (Yiannis) Kompatsiaris, Research Director, CERTH-ITI, Greece.

Sponsors

AI4Media
ICPR

ICPR 2020 Workshop © MMDLCA