BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160552Z
LOCATION:Track 3
DTSTART;TZID=America/New_York:20201113T161000
DTEND;TZID=America/New_York:20201113T164000
UID:submissions.supercomputing.org_SC20_sess222_ws_cafcw118@linklings.com
SUMMARY:Integration of Domain Knowledge Using Medical Knowledge Graph Deep
  Learning for Cancer Phenotyping
DESCRIPTION:Workshop\n\nIntegration of Domain Knowledge Using Medical Know
 ledge Graph Deep Learning for Cancer Phenotyping\n\nAlawad\n\nA key compon
 ent of deep learning (DL) for natural language processing (NLP) is word em
 beddings. Word embeddings that effectively capture the meaning and context
  of the word that they represent can significantly improve the performance
  of downstream DL models for various NLP tasks. Many existing word embeddi
 ng techniques capture the context of words based on word co-occurrence in
  documents and text; however, they often cannot capture broader domain-spe
 cific relationships between concepts that may be crucial for the NLP task 
 at hand. In this paper, we propose a method to integrate external knowledg
 e from medical terminology ontologies into the context captured by word em
 beddings. Specifically, we use a medical knowledge graph, such as the Unif
 ied Medical Language System (UMLS), to find connections between clinical t
 erms in cancer pathology reports. This approach aims to minimize the dista
 nce between connected clinical concepts. We evaluate the proposed approach
  using a Multitask Convolutional Neural Network (MT-CNN) to extract six ca
 ncer characteristics – site, subsite, laterality, behavior, histology, and
  grade – from a dataset of 900K cancer pathology reports. The results show
  that the MT-CNN model which uses our domain-informed embeddings outperfor
 ms the same MT-CNN using standard word2vec embeddings across all tasks, wi
 th an improvement in the overall micro- and macro-F1 scores by 4.97% and 2
 2.5%, respectively.\n\nRegistration Category: Workshop Reg Pass
END:VEVENT
END:VCALENDAR

