In recent years, driven by Internet technology, policy, and the epidemic, demand for online medical consultation has grown rapidly, and research results in natural language processing (NLP) are increasingly being put into practice in this field. Although intelligent consultation systems can already perform pre-diagnosis, the results they give often differ greatly from the patient’s actual condition. For example, when a patient gives precise symptoms such as “body temperature of 38.5 degrees Celsius, a slight fever, and limb weakness,” the system can easily query against that information and return a reasonably accurate basic diagnosis, such as a cold or a viral infection. However, when the patient gives only vague complaints such as “fatigue” or “chest pain,” the intelligent diagnostic system may be powerless.
This is because there is still a significant gap between computers and humans in the accuracy and depth of text understanding. The gap is especially wide in the medical field, where a computer must not only learn a large number of professional terms and build a knowledge graph but, more importantly, understand the vague, subjective complaints of patients who lack professional knowledge about their symptoms, and map those complaints to professional terminology.
To close this gap, researchers need to “feed” the algorithm a huge professional corpus and everyday knowledge, improving its capabilities and strengthening AI’s understanding of the real world. They also need better strategies and appropriate models to address the problems currently facing medical NLP. This is the main difficulty to be solved in the “Intelligent Medical Diagnosis” track of the Intelligent Medical Dialogue Diagnosis and Treatment Evaluation at the 20th China Conference on Computational Linguistics (hereinafter CCL2021).
In this track, the solution submitted by the Tencent Tianyan Laboratory team won first place with high disease prediction accuracy and symptom recall. Next, let’s look at the algorithm design and model selection behind this solution.
Task difficulty: getting the algorithm to quickly understand “patients”
The task of the “Intelligent Medical Diagnosis” track is to develop an interactive program that simulates a real consultation. The program is pitted against a patient simulator built from more than 2,000 doctor-patient conversation samples. First, it interacts with the baseline model provided by the organizers to determine the “patient’s” initial symptoms. Then, based on this information, it outputs questions that can elicit further useful information and asks the “patient” about their symptoms. Finally, within no more than 11 interactions, it must identify the “patient’s” disease and symptoms. Results are ranked by diagnostic accuracy and symptom recall.
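The consultation protocol described above can be sketched as a simple loop. This is a minimal illustration, not the organizers’ code: `ToyPatient`, `run_consultation`, and `symptom_recall` are hypothetical names, the sketch assumes the final diagnosis uses up one of the 11 interactions, and the recall formula (share of the patient’s hidden symptoms that the program asked about) is an assumed reading of the metric rather than its official definition.

```python
from typing import Callable, Dict, Optional, Set, Tuple

MAX_TURNS = 11  # interaction budget stated in the task description


class ToyPatient:
    """Hypothetical stand-in for the organizers' patient simulator."""

    def __init__(self, explicit: Dict[str, bool], hidden: Dict[str, bool]):
        self.explicit = explicit  # symptoms stated in the self-report
        self.hidden = hidden      # symptoms revealed only when asked

    def answer(self, symptom: str) -> Optional[bool]:
        # True/False if the simulator knows the answer, None otherwise.
        return self.hidden.get(symptom)


def run_consultation(
    patient: ToyPatient,
    choose_question: Callable[[Dict[str, bool], Set[str]], Optional[str]],
    diagnose: Callable[[Dict[str, bool]], str],
) -> Tuple[str, Dict[str, bool], Set[str]]:
    known = dict(patient.explicit)  # start from the explicit symptoms
    asked: Set[str] = set()
    # Assumption: the last interaction is reserved for the diagnosis.
    for _ in range(MAX_TURNS - 1):
        symptom = choose_question(known, asked)
        if symptom is None:  # nothing useful left to ask
            break
        asked.add(symptom)
        reply = patient.answer(symptom)
        if reply is not None:
            known[symptom] = reply
    return diagnose(known), known, asked


def symptom_recall(asked: Set[str], hidden: Dict[str, bool]) -> float:
    """Assumed metric: fraction of hidden symptoms that were asked about."""
    if not hidden:
        return 1.0
    return len(asked & set(hidden)) / len(hidden)
```

Under this reading, every question spent on a symptom the simulator knows nothing about wastes part of the budget, which is why the choice of the next question matters so much.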
The difficulty is that each of the 2,000-plus conversation samples contains a large amount of information: the disease category, the text of the patient’s self-report, explicit information (entities and symptoms mentioned directly in the self-report), and implicit information (entities and labels that can only be obtained by reading the entire doctor-patient conversation, and that indicate whether the patient actually has a given symptom). Moreover, like real-world patients, the machine “patients” do not state their symptoms clearly all at once; for example, they may describe a single symptom in several ways or raise other complaints.
The algorithm and models chosen by a competitor must therefore not only “read” vaguely described symptoms but also classify them quickly. Based on the patient information gathered so far, the program must also accurately judge which other symptoms the “patient” may have, so that the “patient” yields as much useful information as possible within the limited number of interactions. This maximizes both the accuracy of the disease diagnosis and the symptom recall.
The task therefore tests not only the capability of the algorithm but also how well the algorithm and model strategies are matched to improve the accuracy and efficiency of the consultation.
Solution: more efficient algorithms plus better-suited models to improve inference speed
To help the AI better understand “patient” information, Tencent Tianyan Lab built its program with multiple NLP and machine learning techniques, including retrieval, question answering, pre-training, and classification. The overall program is divided into two modules: symptom inquiry and disease prediction. Both modules adopt the same prediction scheme, and each is subdivided into three parts: retrieval-based search of historical cases, natural-language-based symptom/disease prediction, and symptom-based symptom/disease prediction (as shown in the figure). The three parts run simultaneously within the same interaction turn, and their outputs are “calibrated” by a weighting algorithm to obtain the symptom to ask about next or the disease to output as the diagnosis.
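The weighting step can be illustrated with a simple weighted average over the three parts’ scores. This is only a sketch: the article does not specify the actual weighting algorithm, so the function names, the normalization of weights, and the example numbers are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def fuse_scores(
    component_scores: List[Dict[str, float]],
    weights: List[float],
) -> Dict[str, float]:
    """Weighted average of per-candidate scores from several predictors.

    Each dict maps a candidate (a symptom to ask about, or a disease)
    to a score; a candidate missing from a component contributes 0.
    """
    if len(component_scores) != len(weights):
        raise ValueError("need exactly one weight per component")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused: Dict[str, float] = defaultdict(float)
    for scores, weight in zip(component_scores, weights):
        for candidate, score in scores.items():
            fused[candidate] += (weight / total) * score
    return dict(fused)


def top_candidate(fused: Dict[str, float]) -> str:
    """Return the candidate with the highest fused score."""
    return max(fused, key=fused.get)


# Hypothetical scores from the three parts for two candidate symptoms.
retrieval = {"cough": 0.6, "headache": 0.4}
language_model = {"cough": 0.2, "headache": 0.8}
symptom_classifier = {"cough": 0.5, "headache": 0.5}

fused = fuse_scores([retrieval, language_model, symptom_classifier], [1, 1, 2])
next_question = top_candidate(fused)  # "headache" for these numbers
```

In a setup like this the weights themselves are hyperparameters that would be tuned on held-out conversations.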
Symptom Inquiry Prediction Framework Chart
Retrieval-based search of historical cases uses techniques such as exact search, fuzzy search, and Bayesian inference to find similar cases in the algorithm’s case database. Its advantage is that combining fuzzy and exact representations of the chief-complaint symptoms broadens the search scope for symptoms and diseases, and it also predicts symptoms more efficiently.
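A minimal sketch of this idea, under stated assumptions, is shown here. The case format, the string-similarity threshold, and the Laplace-smoothed frequency estimate standing in for “Bayesian inference” are all illustrative choices; the article does not describe the team’s actual retrieval or inference method.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def is_match(query: str, candidate: str, threshold: float = 0.8) -> bool:
    """Exact match first, then fuzzy match on string similarity."""
    if query == candidate:
        return True
    return SequenceMatcher(None, query, candidate).ratio() >= threshold


def retrieve_cases(query_symptoms: List[str], cases: List[dict],
                   threshold: float = 0.8) -> List[dict]:
    """Keep cases in which every query symptom has an exact or fuzzy match."""
    return [
        case for case in cases
        if all(any(is_match(q, s, threshold) for s in case["symptoms"])
               for q in query_symptoms)
    ]


def next_symptom_probs(query_symptoms: List[str], cases: List[dict],
                       vocabulary: List[str],
                       alpha: float = 1.0) -> Dict[str, float]:
    """Smoothed estimate of P(symptom | query) from the retrieved cases."""
    hits = retrieve_cases(query_symptoms, cases)
    probs = {}
    for symptom in vocabulary:
        if symptom in query_symptoms:
            continue  # already known, no need to ask
        count = sum(1 for case in hits if symptom in case["symptoms"])
        probs[symptom] = (count + alpha) / (len(hits) + 2 * alpha)
    return probs


# Hypothetical historical cases.
cases = [
    {"disease": "cold", "symptoms": {"fever", "cough"}},
    {"disease": "cold", "symptoms": {"fevers", "sore throat"}},
    {"disease": "gastritis", "symptoms": {"nausea"}},
]
probs = next_symptom_probs(["fever"], cases, ["cough", "sore throat", "nausea"])
```

Here the fuzzy match lets the query “fever” also retrieve the case recorded as “fevers”, which is the broadening effect the paragraph describes.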
Natural-language-based prediction converts the symptom list into natural language and then uses a pre-trained language model to predict the probability distribution over the symptoms to ask about. Notably, the model the team used for this part is MedBERT, Tianyan Laboratory’s own large-scale pre-trained medical language model. It is obtained by continued pre-training of RoBERTa on large-scale online medical text. It adapts better to language in the medical domain and has achieved state-of-the-art (SOTA) results on multiple standard medical datasets. Compared with general-purpose pre-trained models, MedBERT is better suited to medical tasks.
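The step of converting a symptom list into natural language might look like the following. The sentence template and the function name `verbalize_symptoms` are hypothetical, since the article does not give the wording the team used; the resulting text would then be passed to the pre-trained language model as ordinary input.

```python
from typing import Dict


def verbalize_symptoms(self_report: str, known: Dict[str, bool]) -> str:
    """Turn a self-report plus structured symptom flags into plain text."""
    present = sorted(s for s, has_it in known.items() if has_it)
    absent = sorted(s for s, has_it in known.items() if not has_it)
    parts = [self_report.strip()]
    if present:
        parts.append("The patient has " + ", ".join(present) + ".")
    if absent:
        parts.append("The patient does not have " + ", ".join(absent) + ".")
    return " ".join(parts)


text = verbalize_symptoms(
    "I feel tired. ",
    {"fever": True, "cough": True, "headache": False},
)
```

Sorting the symptom names keeps the generated text stable regardless of the order in which symptoms were collected during the dialogue.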
For symptom-based symptom/disease prediction, the solution uses XGBoost, a classifier that has been validated in many competitions and has excellent classification performance. Its advantage is that it keeps the learned model simple and helps prevent overfitting, and a simpler model in turn improves the operational efficiency of the algorithm.
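A tree-based classifier needs the collected symptoms as a fixed-length feature vector. One plausible encoding, sketched here as an assumption rather than the team’s actual feature design, uses 1 for a confirmed symptom, -1 for a denied one, and 0 for a symptom that has not been asked about yet.

```python
from typing import Dict, List


def encode_symptoms(known: Dict[str, bool], vocabulary: List[str]) -> List[int]:
    """Encode symptom states as 1 (present), -1 (absent), 0 (unknown)."""
    vector = []
    for symptom in vocabulary:
        if symptom not in known:
            vector.append(0)   # not asked about yet
        elif known[symptom]:
            vector.append(1)   # patient confirmed the symptom
        else:
            vector.append(-1)  # patient denied the symptom
    return vector


features = encode_symptoms({"fever": True, "cough": False},
                           ["cough", "fever", "rash"])
```

Vectors like these, paired with disease labels from the conversation samples, could then be used to train a gradient-boosted classifier such as `xgboost.XGBClassifier`; training is omitted because the article gives no details about it.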
Disease prediction framework
The multi-strategy fusion method for recall and prediction combines the complementary strengths of the three components (retrieval, natural-language-based prediction, and symptom-based prediction) to achieve higher accuracy and symptom recall. For symptom recall, it also encourages more rounds of symptom inquiry and relies on careful hyperparameter tuning to push recall higher. As a result, in the final evaluation Tianyan Lab achieved the highest overall score on disease prediction accuracy and symptom recall, and exceeded the other teams’ solutions by more than 10% on symptom recall.
The result not only shows Tianyan Laboratory’s comparative advantages in algorithms and models but also reflects the strength it has built over many years of AI algorithm research and application in the medical and health field.
Tencent Tianyan Laboratory has long focused on NLP research in the medical and health field, and its results have been deployed in business areas of Tencent Internet Hospital such as triage guidance and assisted diagnosis, rational medication use, and health assistants. Tianyan Laboratory also hopes to promote innovative NLP research at the industry level: for example, by holding the MLPCP Challenge (International Challenge for Medical Dialogue Generation and Automatic Diagnosis) at the deep learning conference ICLR 2021, to promote breakthroughs in medical consultation dialogue systems and in predicting the diseases a patient may have; and, jointly with CCKS 2021 (National Conference on Knowledge Graph and Semantic Computing) and Sun Yat-sen University, by organizing an evaluation on generating Chinese medical dialogues that contain entities, to support research innovation and algorithm improvement in areas such as natural language fundamentals, language understanding, information extraction, and knowledge graph construction… In the future, Tianyan Laboratory will continue to put down roots in the medical and health field and to explore and deliver more value in both academic research and applications of NLP.