Dr. Dipak Rimal
BeeCorp, IN, United States
Division Chair
The Data Science/QIS/AI Division provides a forum for exchanging knowledge in technology domains including data science, machine learning, quantum information science (QIS), and artificial intelligence. These emerging technologies have the potential to affect every aspect of our lives. The division strives to provide a venue to discuss research topics, exchange ideas and experiences from academia and industry, and inspire collaborations.

Abstract Submission Closed!

Deadline: June 1st, 2021

Thank you for your support. We will update you with the latest programs soon!

Invited Speaker

Dr. Kamal Dhungana
IN10T Inc. (INTENT), Greater St. Louis, United States

Starting a Career in Data Science

Abstract:

Data science is a part of artificial intelligence, and its goal is to build or develop the means to solve business-related problems using data. With recent developments in technology, especially electronics and storage devices, data collection has become simple, and companies have been able to gather massive amounts of data in a short period of time. As a result, the scope of data science has increased immensely in recent years. By its nature, data science is an interdisciplinary field, and people with different academic and professional backgrounds can contribute to it. This interdisciplinary nature sometimes makes the hiring process for data scientists and engineers a bit challenging. In this presentation, I will discuss some of the issues that aspiring data scientists have faced in recent years, as well as the challenges and opportunities that we encounter early in our careers in this field.

Invited Speaker

Dr. Junhao Wen
University of Pennsylvania, United States

Neuroimaging for diagnosis and tracking of neurodegenerative diseases: from univariate statistics to multivariate machine learning

Abstract:

Biomarker identification and tracking in dementia are essential to better understand the pathological mechanism and disease trajectory. First, we aim to identify the most promising biomarkers at the presymptomatic stage of dementia; specifically, we studied this in the case of genetic frontotemporal lobar degeneration due to the C9orf72 mutation. Second, we aim to advance early diagnosis and prognosis by applying machine learning methods to magnetic resonance imaging (MRI) data, which we tackle in the context of sporadic Alzheimer’s disease (AD). For the C9orf72 mutation, biomarkers were identified from conventional T1-weighted MRI and the diffusion tensor imaging (DTI) model. We then compared the sensitivity and specificity of the advanced NODDI model to those of conventional techniques, namely T1-weighted MRI and DTI. The second part focuses on the early diagnosis of AD. We propose an open-source framework for reproducible evaluation of AD classification using diffusion MRI and conventional ML methods, then extend this framework to deep learning methods and demonstrate its use on T1-weighted MRI.

Session Schedule

Date/Time:
ET: July 16, 2021 04:30 PM
Nepal: July 17, 2021 02:15 AM
Abstract Number: ANPA2021_0101

Presenting Author: Kamal Dhungana (Invited)

Title: Starting a Career in Data Science

Data science is a part of artificial intelligence, and its goal is to build or develop the means to solve business-related problems using data. With recent developments in technology, especially electronics and storage devices, data collection has become simple, and companies have been able to gather massive amounts of data in a short period of time. As a result, the scope of data science has increased immensely in recent years. By its nature, data science is an interdisciplinary field, and people with different academic and professional backgrounds can contribute to it. This interdisciplinary nature sometimes makes the hiring process for data scientists and engineers a bit challenging. In this presentation, I will discuss some of the issues that aspiring data scientists have faced in recent years, as well as the challenges and opportunities that we encounter early in our careers in this field.

Date/Time:
ET: July 16, 2021 05:00 PM
Nepal: July 17, 2021 02:45 AM
Abstract Number: ANPA2021_0102

Presenting Author: Unab Javed

Title: Decoding Gas Mixtures Using Machine Learning

We use machine learning models to predict the composition of unknown gas mixtures from the output of an array of electrochemical sensors. The sensors produce voltage responses in the presence of complex gas mixtures. These voltages are the inputs to our machine learning pipeline, which first predicts which gases are present at non-zero concentrations in the mixture using multi-class classification and then, in a second step, predicts the concentrations of those gases. Our model is able to classify the gas species and quantify their concentrations with high accuracy using only one voltage reading from each sensor. This framework can be easily expanded to include more gases and can be used in automotive, environmental, and various industrial settings.
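The two-step idea in this abstract (classify which gases are present, then estimate their concentrations) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the talk: the linear sensor model, the `SENSITIVITY` values, the synthetic training data, and the choice of a k-nearest-neighbour model for both stages.

```python
import math
import random

# Hypothetical sensitivities of 3 sensors (rows) to 2 gases (columns).
SENSITIVITY = [(0.9, 0.1), (0.2, 0.8), (0.5, 0.5)]
PATTERNS = [(1, 0), (0, 1), (1, 1)]  # which of the two gases are present


def sensor_voltages(conc, noise, rng):
    """One voltage reading per sensor (arbitrary units) for given concentrations."""
    return [sum(s * c for s, c in zip(row, conc)) + rng.gauss(0.0, noise)
            for row in SENSITIVITY]


def make_training_set(n, rng):
    """Synthetic (voltages, presence pattern, concentrations) samples."""
    samples = []
    for _ in range(n):
        present = rng.choice(PATTERNS)
        conc = tuple(p * rng.uniform(10.0, 100.0) for p in present)
        samples.append((sensor_voltages(conc, 0.5, rng), present, conc))
    return samples


def nearest(samples, voltages, k):
    """The k samples whose voltage vectors are closest to the query."""
    return sorted(samples, key=lambda s: math.dist(s[0], voltages))[:k]


def predict(train, voltages, k=5):
    # Stage 1: classify the presence pattern by majority vote of the k neighbours.
    votes = [s[1] for s in nearest(train, voltages, k)]
    present = max(PATTERNS, key=votes.count)
    # Stage 2: estimate concentrations by averaging neighbours from that class only.
    same_class = [s for s in train if s[1] == present]
    close = nearest(same_class, voltages, k)
    conc = tuple(sum(s[2][i] for s in close) / len(close) for i in range(2))
    return present, conc


rng = random.Random(0)
train = make_training_set(900, rng)
present, conc = predict(train, sensor_voltages((40.0, 60.0), 0.0, rng))
print(present, [round(c, 1) for c in conc])
```

Restricting the second stage to samples of the predicted class means a gas judged absent is reported at exactly zero concentration rather than at a small spurious value.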
Date/Time:
ET: July 16, 2021 05:15 PM
Nepal: July 17, 2021 03:00 AM
Abstract Number: ANPA2021_0103

Presenting Author: Dibakar Sigdel

Title: Blockchain as a bridge to talk to your body cells.

Rapid development in cryptography and blockchain calls for a deep understanding of this technology and of its possible applications in diverse fields, e.g., decentralized finance (DeFi), non-fungible tokens (NFTs), the decentralized internet (Web 3.0), and new-generation database systems (IPFS). High-precision medicine is an active research area that seeks to implement personalized medicine based on person-specific omics data (genomics, transcriptomics, proteomics, metabolomics) to delineate molecular mechanisms for disease diagnosis and therapeutics.

Brain-computer interface (e.g., Neuralink) is another active research area, where one needs to translate the molecular-mechanism-based language of cell-to-cell communication into an electronic, in silico language. Handling these personal data and making them interoperable through new-generation personal applications (e.g., Web 3.0) is increasingly challenging. Blockchain-based decentralized data storage looks promising not only for security and interoperability but also for advanced computations.

In this talk, current developments and challenges in high-precision medicine and brain-computer interfaces will be presented from the perspective of molecular pathways and cell-to-cell communication. The main focus is on how the new data management and computation capabilities offered by blockchain technology could be a cornerstone of the next technological revolution.

Date/Time:
ET: July 16, 2021 05:30 PM
Nepal: July 17, 2021 03:15 AM
Abstract Number: ANPA2021_0104

Presenting Author: Hari Khanal

Title: Neural network to solve Partial Differential Equations

Solving partial differential equations (PDEs) is essential and ubiquitous in science and engineering, and it becomes difficult when the problems describe complex phenomena involving many independent variables. Approximate methods can be used, but solving a complicated PDE can take millions of CPU hours. In this talk, I will discuss recently developed techniques that use an artificial neural network to approximate the PDE solution an order of magnitude faster than traditional techniques.
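The abstract does not say which technique is used. As an illustration only, one common way to pose PDE solving as a neural network training problem is to minimize the PDE residual at sampled points. The notation below is assumed, not taken from the talk: a PDE $\mathcal{N}[u](x) = f(x)$ on a domain $\Omega$ with boundary condition $u = g$ on $\partial\Omega$, a network $u_\theta$ with parameters $\theta$, interior collocation points $x_i$, and boundary points $y_j$.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N_r} \sum_{i=1}^{N_r}
      \left| \mathcal{N}[u_\theta](x_i) - f(x_i) \right|^2
  + \frac{1}{N_b} \sum_{j=1}^{N_b}
      \left| u_\theta(y_j) - g(y_j) \right|^2
```

The first term penalizes violation of the equation inside the domain and the second enforces the boundary condition. Once trained, the network can be evaluated at any point without a mesh.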
Date/Time:
ET: July 16, 2021 06:00 PM
Nepal: July 17, 2021 03:45 AM
Abstract Number: ANPA2021_0105

Presenting Author: Junhao Wen (Invited)

Title: Neuroimaging for diagnosis and tracking of neurodegenerative diseases: from univariate statistics to multivariate machine learning

Biomarker identification and tracking in dementia are essential to better understand the pathological mechanism and disease trajectory. First, we aim to identify the most promising biomarkers at the presymptomatic stage of dementia; specifically, we studied this in the case of genetic frontotemporal lobar degeneration due to the C9orf72 mutation. Second, we aim to advance early diagnosis and prognosis by applying machine learning methods to magnetic resonance imaging (MRI) data, which we tackle in the context of sporadic Alzheimer’s disease (AD). For the C9orf72 mutation, biomarkers were identified from conventional T1-weighted MRI and the diffusion tensor imaging (DTI) model. We then compared the sensitivity and specificity of the advanced NODDI model to those of conventional techniques, namely T1-weighted MRI and DTI. The second part focuses on the early diagnosis of AD. We propose an open-source framework for reproducible evaluation of AD classification using diffusion MRI and conventional ML methods, then extend this framework to deep learning methods and demonstrate its use on T1-weighted MRI.

Date/Time:
ET: July 16, 2021 06:30 PM
Nepal: July 17, 2021 04:15 AM
Abstract Number: ANPA2021_0106

Presenting Author: Amartya Singh

Title: Exploring tumor heterogeneity at the transcriptomic and epigenomic level using a novel biclustering approach

Alterations in gene expression patterns in tumor cells play a key role in tumor development and progression. Molecular classification of cancers based on gene expression has led to a tremendous improvement in treatment outcomes for several cancer types (for example, breast cancer). However, it is not possible to characterize the extensive heterogeneity, both within and across different tumor types, using traditional clustering approaches. Exploring this heterogeneity and diversity in the aberrant signatures that may have a direct impact on treatment outcomes can help identify biomarkers of clinical relevance. Over the past decade, high-throughput sequencing technologies have enabled comprehensive studies of the genome, the transcriptome, and even the epigenome. Massive collaborative efforts, such as the ones undertaken by The Cancer Genome Atlas (TCGA), have enabled the generation of large transcriptomic (RNA-seq) and epigenomic (DNA methylation, Illumina 450k) data sets for multiple cancer types. Unsupervised approaches that enable the identification of clinically relevant alterations in the transcriptome and the epigenome can shed some light on the precise nature of these alterations and their complex interplay within tumors.

We developed a biclustering method, called the Tunable Biclustering Algorithm (TuBA), to analyze large gene expression data sets. Based on a novel proximity measure that is designed to leverage the size of the data sets instead of relying on actual gene expression values (thereby reducing the spurious impact of technical and biological noise), TuBA preferentially identifies subsets of tumor samples that exhibit aberrant co-expression of subsets of genes due to some shared underlying mechanism(s). We have applied TuBA to several independent gene expression data sets of multiple cancer types and have shown that several of the co-expression signatures identified by TuBA were associated with clinical outcomes. Moreover, we were able to associate several of these co-expression signatures with alterations (gains or losses) in the copy numbers of the corresponding genes. The aim of this multi-cancer study is to seek (i) alterations that are shared across multiple cancer types and (ii) alterations that are specific to a given cancer type. Identification of both kinds of alterations can lead us towards potential biomarkers of clinical significance.

Given the simple nature of TuBA's proximity measure, it is straightforward to adapt and use TuBA to analyze DNA methylation data sets. Our aim in biclustering DNA methylation data using TuBA is to help us integrate alterations in the epigenome with alterations in gene expression in a completely unsupervised manner. This can even help us discover concerted alterations in CpG profiles that are responsible for some of the altered gene co-expression patterns found in gene expression data.

Date/Time:
ET: July 16, 2021 06:45 PM
Nepal: July 17, 2021 04:30 AM
Abstract Number: ANPA2021_0108

Presenting Author: Kiran Khanal

Title: Recommendation System Using Node Embeddings

Node embeddings are compressed representations of a graph's topology that preserve important network features. An unsupervised feature-learning technique, built with components from the stellargraph library's Keras implementation of the Node2Vec algorithm, was used to obtain node embeddings of recipe data. Node2Vec feature learning maps the nodes of a graph to a low-dimensional space using second-order biased random walks. The random walks maximize the likelihood of preserving the network neighborhoods of nodes while exploring diverse neighborhoods efficiently and flexibly. A recipe-based recommendation system was developed using node embeddings of a graph created from the recipe data and was implemented in a web application. The pre-trained model performed well at recommending closely linked items for a search query.
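To make the second-order biased random walk concrete, here is a minimal pure-Python sketch of the Node2Vec walk rule. It does not use the stellargraph API, and the tiny ingredient graph, the function name `biased_walk`, and the parameter values are hypothetical. The return parameter `p` and in-out parameter `q` follow the usual Node2Vec convention: a step back to the previous node has weight 1/p, a step to a node adjacent to the previous node has weight 1, and any other step has weight 1/q.

```python
import random

# Hypothetical undirected ingredient graph, given as a list of edges.
EDGES = [("pasta", "tomato"), ("pasta", "garlic"), ("tomato", "garlic"),
         ("tomato", "basil"), ("garlic", "olive oil"),
         ("basil", "olive oil"), ("olive oil", "lemon")]

GRAPH = {}
for a, b in EDGES:
    GRAPH.setdefault(a, set()).add(b)
    GRAPH.setdefault(b, set()).add(a)


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=random):
    """Return a walk of up to `length` nodes using Node2Vec-style biases."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break  # dead end: stop early
        if len(walk) == 1:
            # The first step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # return to where we came from
            elif candidate in graph[previous]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move outward
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


print(biased_walk(GRAPH, "pasta", 8, p=0.5, q=2.0, rng=random.Random(0)))
```

In a full pipeline, many such walks would be fed to a skip-gram style model to learn the embeddings. A small `q` pushes walks outward to explore, while a large `q` keeps them in the local neighborhood.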