Analysis of Dental Language Network in News Articles Using News Big Data
Int J Clin Prev Dent 2020;16(4):159-164
Published online December 31, 2020;  https://doi.org/10.15236/ijcpd.2020.16.4.159
© 2020 International Journal of Clinical Preventive Dentistry.

Kyung-Hui Moon

Department of Dental Hygiene, Jinju Health College, Jinju, Korea
Correspondence to: Kyung-Hui Moon
E-mail: next77_kr@naver.com
https://orcid.org/0000-0002-4584-4237
Received December 2, 2020; Revised December 9, 2020; Accepted December 12, 2020.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Objective: This study analyzed the ‘dental’ language network in news articles using news big data on dentistry. To this end, articles from a total of 54 media companies, including Kyunghyang Daily News, published from January 1 to June 30, 2020, were collected through Big Kinds and used as data.
Methods: The data were analyzed using Microsoft Excel 2016 (Microsoft, USA) and NetMiner (ver. 4.4.1; Cyram Inc., Korea). Keyword analysis was conducted to identify the language network of dental-related news articles, and centrality analysis was conducted to identify keywords.
Results: The most frequently mentioned word among the total of 21,549 words from dental-related articles was ‘dentistry’, and articles about ‘hospital’, ‘medical’, ‘Corona’, and ‘patient’ occupied a large proportion. The top degree centrality words were ‘dentistry’, ‘medical’, ‘hospital’, and ‘region’, while the top betweenness centrality words were ‘dentistry’, ‘medical’, ‘region’, ‘hospital’, and ‘Corona’. The top closeness centrality words were ‘dentistry’, ‘medical’, ‘hospital’, ‘region’, and ‘health’.
Conclusion: In the news articles from the first half of 2020 extracted by Big Kinds for the search query ‘dentistry’, the most frequently mentioned word was ‘dentistry’, and articles on ‘hospital’, ‘medical’, ‘Corona’, and ‘patient’ accounted for a large proportion. This suggests that the overall social confusion caused by COVID-19 extended to the dental field and was affected by the structural characteristics of COVID-19 outbreak areas. Further studies on the changes in the Korean dental industry caused by the COVID-19 pandemic in 2020 are needed.
Keywords: dentistry, news articles, language network analysis, big data
Introduction

Korea’s economic development and industrialization have greatly improved the national economy and the level of education and have also brought about many changes in the medical field [1]. With the recent development of the Internet and mass media, the public can easily access and share a large amount of information. The Internet and media have become popular among the public in a short time and have made life easier for people in many ways. The situation is much the same in the dental field.

Analyzing how the media covers dentistry can offer guidance for making policy in a rapidly changing environment. Despite the frequent mention of dentistry as a keyword in the media, both domestically and internationally, there has been no analysis of the nature or attributes of this language itself [2]. In recent years, research on language networks has been actively conducted in various fields of industry to analyze the properties of texts and to reveal the structure of their intellectual patterns by extracting keywords from primary texts and forming networks among those keywords [3]. In the field of dentistry, however, such research is insufficient. Therefore, the purpose of this study is to analyze how the media, which most people use as their main source of information, covers the issue of dentistry. In particular, this study analyzes whether the media is providing balanced information from the viewpoint of development and conservation [4] and tries to derive implications from the results.

Keyword analysis was carried out with NetMiner (ver. 4.4.1; Cyram Inc., Seongnam, Korea), a big data analytics program. Keywords were identified, and centrality analysis was performed to find the language network of dental-related news articles. Accordingly, this study analyzes the keywords of dental-related news articles from the first half of 2020 and, based on the results, derives implications about what information needs to be provided in order to establish a strategy for balanced information delivery in the media, which is a premise for the rational development of dentistry in the future.

Materials and Methods

1. Subjects

This study analyzed news articles provided by Big Kinds (www.bigkinds.or.kr), a news big data service of the Korea Press Foundation. With ‘dentistry’ as the search keyword, 2,278 news articles reported by a total of 54 media companies during the first half of 2020 (January 1 to June 30) were retrieved and analyzed.

2. Methods

In this study, the data collected from Big Kinds were analyzed using Microsoft Excel 2016 (Microsoft, Redmond, WA, USA) and NetMiner. NetMiner is a social network analytics program that extracts keywords from the retrieved information and supports further research by providing not only cross-sectional analysis of the results but also dynamic networks and the underlying patterns of keywords [5]. Keyword analysis was therefore conducted with NetMiner to identify the language network of dental-related news articles, and centrality analysis was performed to find the keywords and the structure of the language network.

Results

1. Top-ranked keyword analysis

A total of 2,278 news articles published during the first half of 2020 were extracted by Big Kinds for the search query ‘dentistry’, and a total of 21,549 words were found in those articles. Single-character words, whose meaning is difficult to confirm, were excluded. In addition, to identify the top-ranked keywords that appear frequently in particular news articles while excluding common words that appear in most news articles, only keywords with a Term Frequency-Inverse Document Frequency (TF-IDF) weight of 0.5 or more were extracted. The top 25 high-frequency words are shown in Table 1. The most frequently mentioned word was ‘dentistry’, and articles on ‘hospital’, ‘medical’, ‘Corona’, and ‘patient’ accounted for a large proportion.

Table 1. The top 25 high-frequency words

Rank  Keyword         Frequency
1     Dentistry       5,849
2     Hospital        4,244
3     Medical         4,197
4     Corona          3,281
5     Patient         2,568
6     Treatment       2,480
7     Examination     2,223
8     Health          2,109
9     Teeth           2,092
10    Region          2,044
11    Support         1,909
12    Implant         1,905
13    Business        1,788
14    Community       1,734
15    Health care     1,651
16    Confirmed       1,615
17    Doctor          1,567
18    Oral            1,480
19    Potential       1,450
20    Institution     1,408
21    Center          1,408
22    Management      1,302
23    Representative  1,300
24    Human           1,279
25    Infection       1,272
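The filtering step described above (dropping single-character words and keeping terms whose TF-IDF weight reaches a cutoff) can be sketched as follows. This is a minimal illustration, not the procedure NetMiner applies: the weighting used here (relative term frequency multiplied by ln(N/df)), the helper names `tf_idf` and `select_keywords`, the toy articles, and the cutoff of 0.25 are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: (count / document length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def select_keywords(docs, threshold):
    """Keep terms longer than one character whose best weight reaches the threshold."""
    best = {}
    for doc_weights in tf_idf(docs):
        for term, weight in doc_weights.items():
            if len(term) > 1:  # single-character words are excluded
                best[term] = max(best.get(term, 0.0), weight)
    return {term for term, weight in best.items() if weight >= threshold}


# Hypothetical tokenized articles, for illustration only.
docs = [
    ["dentistry", "hospital", "corona", "patient"],
    ["dentistry", "implant", "implant", "market"],
    ["dentistry", "hospital", "support", "a"],
]
keywords = select_keywords(docs, threshold=0.25)
print(sorted(keywords))
```

Under this textbook variant, a term that occurs in every document receives a weight of zero, so the weights produced here are not comparable with the 0.5 cutoff reported in the study, whose exact weighting scheme is not stated.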


2. Network centrality analysis

Among the top-ranked keywords, keyword network analysis was carried out on the top 100 high-frequency keywords with more than five co-occurrences. Table 2 shows the results for degree centrality, betweenness centrality, and closeness centrality, which were examined to find the influence and relationships of keywords in news articles.
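As a rough illustration of how a keyword network of this kind can be built, the sketch below counts how often two keywords appear in the same article and keeps only the pairs whose count exceeds a minimum. Treating the whole article as the co-occurrence window is an assumption, as are the function name `cooccurrence_edges` and the toy data; the study reports a minimum of more than five co-occurrences, while the toy example uses a minimum of one so that the output stays small.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs, keywords, min_count):
    """Count article-level keyword co-occurrences and keep pairs seen more than min_count times."""
    pair_counts = Counter()
    for doc in docs:
        # Each keyword counts once per article; sorting gives every pair a canonical order.
        present = sorted(set(doc) & keywords)
        pair_counts.update(combinations(present, 2))
    return {pair: count for pair, count in pair_counts.items() if count > min_count}


# Hypothetical tokenized articles, for illustration only.
docs = [
    ["dentistry", "hospital", "corona"],
    ["dentistry", "hospital", "patient"],
    ["dentistry", "corona", "region"],
    ["hospital", "dentistry"],
]
keywords = {"dentistry", "hospital", "corona", "patient", "region"}
edges = cooccurrence_edges(docs, keywords, min_count=1)
print(edges)
```

Each surviving pair becomes an undirected link between two keyword nodes, and the centrality measures are then computed on that graph.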

Table 2. Centrality analysis of the network structure

Rank  Degree centrality         Betweenness centrality    Closeness centrality
      Keyword       Centrality  Keyword       Centrality  Keyword       Centrality
1     Dentistry     0.808       Dentistry     0.083       Dentistry     0.839
2     Medical       0.727       Medical       0.058       Medical       0.786
3     Hospital      0.697       Region        0.046       Hospital      0.767
4     Region        0.636       Hospital      0.045       Region        0.733
5     Health        0.545       Corona        0.035       Health        0.688
6     Corona        0.545       Patient       0.029       Corona        0.688
7     Patient       0.535       Health        0.028       Patient       0.683
8     Examination   0.525       Support       0.027       Examination   0.678
9     Support       0.515       Potential     0.022       Support       0.673
10    Treatment     0.515       Treatment     0.021       Treatment     0.673
11    Potential     0.485       Implant       0.021       Potential     0.660
12    Implant       0.475       Examination   0.019       Implant       0.656
13    Business      0.404       Business      0.012       Business      0.627
14    Institution   0.384       Market        0.011       Institution   0.619
15    Doctor        0.374       Institution   0.011       Doctor        0.615
16    Management    0.364       Community     0.011       Management    0.611
17    Education     0.364       Oral          0.010       Education     0.611
18    Oral          0.364       Government    0.010       Oral          0.611
19    Community     0.354       Doctor        0.010       Community     0.607
20    Progress      0.354       Education     0.009       Progress      0.607


First, degree centrality is a measure of how many connections a keyword has in the network [6]. In other words, it shows which keywords have connections in the dentistry network and how many. The degree centrality analysis showed that, in the top keyword group, ‘dentistry’ had the highest degree centrality, followed by ‘medical’, ‘hospital’, and ‘region’. These were followed, in order, by ‘health’, ‘Corona’, ‘patient’, ‘examination’, ‘support’, and ‘treatment’.

Second, betweenness centrality is a measure of how much one keyword acts as an intermediary between other keywords in the network [6]. The betweenness centrality analysis showed that ‘dentistry’ had the highest betweenness centrality, followed by ‘medical’, ‘region’, ‘hospital’, ‘Corona’, ‘patient’, and ‘health’. When the top group for betweenness centrality is compared with the top group for degree centrality, the keywords ‘dentistry’ and ‘medical’ hold the same rank, while ‘region’ and ‘Corona’ rank higher in betweenness centrality than in degree centrality.

Third, closeness centrality indicates how close a keyword is to all other keywords in the network [6]. The closeness centrality analysis showed that, in the top group, ‘dentistry’ had the highest closeness centrality, followed by ‘medical’, ‘hospital’, ‘region’, ‘health’, and ‘Corona’. The same keywords rank highly as in the degree centrality analysis, indicating that these high-ranked keywords have the most influence in the entire network.
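The three measures described above can be sketched in plain Python on a small undirected graph. This is an illustrative implementation under stated assumptions, not NetMiner's: the graph is assumed to be connected and unweighted with at least three nodes, degree is normalized by n - 1, closeness is (n - 1) divided by the sum of shortest-path distances, and betweenness is normalized by the number of node pairs, (n - 1)(n - 2)/2. The toy graph and all function names are hypothetical.

```python
from collections import deque


def degree_centrality(graph):
    """Share of the other n - 1 nodes that each node is directly linked to."""
    n = len(graph)
    return {v: len(neighbours) / (n - 1) for v, neighbours in graph.items()}


def closeness_centrality(graph):
    """(n - 1) divided by the sum of shortest-path distances to all other nodes."""
    n = len(graph)
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[source] = (n - 1) / sum(dist.values())
    return result


def betweenness_centrality(graph):
    """Normalized shortest-path betweenness, accumulated with Brandes' algorithm."""
    n = len(graph)
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in graph}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):  # back-propagate pair dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was counted twice, so dividing by (n - 1)(n - 2)
    # equals dividing the undirected score by (n - 1)(n - 2) / 2.
    return {v: b / ((n - 1) * (n - 2)) for v, b in score.items()}


# Hypothetical keyword graph as adjacency sets, for illustration only.
graph = {
    "dentistry": {"hospital", "corona", "patient", "region"},
    "hospital": {"dentistry", "corona"},
    "corona": {"dentistry", "hospital"},
    "patient": {"dentistry"},
    "region": {"dentistry"},
}
dc = degree_centrality(graph)
cc = closeness_centrality(graph)
bc = betweenness_centrality(graph)
print(dc["dentistry"], cc["hospital"], bc["dentistry"])
```

In this toy graph, ‘dentistry’ is linked to every other node and sits on the only shortest path between five of the six remaining pairs, so it leads on all three measures, which mirrors the pattern in Table 2.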

Discussion

In 2020, the world has been going through a chaotic period due to coronavirus disease 2019 (COVID-19). COVID-19 is a contagious disease caused by severe acute respiratory syndrome coronavirus 2. To date, it is understood that the virus can spread from an infected person to others through respiratory droplets and close contact with a person who has COVID-19. The incubation period for COVID-19 can vary from 1 to 14 days (4 to 7 days on average), and symptoms of various respiratory infections such as fever, tiredness, cough, shortness of breath, and pneumonia may appear [7].

Unlike general medical treatment, routine dental procedures can transmit viruses through aerosols. Because the patient’s oral cavity must be examined directly, face-to-face treatment is essential in dentistry. Among health care providers involved in aerosol-generating work, dentists are at the highest risk of exposure to the virus responsible for COVID-19. In addition, dentists and other dental practitioners are at increased risk of coronavirus infection because dental treatment involves close proximity to the patient’s oropharynx. Unknowing contact with COVID-19 patients, including asymptomatic patients, is also a great concern, as it may cause cross-infection between dental practitioners and patients and among patients who visit the dentist [8,9].

Therefore, amidst the widespread social confusion caused by COVID-19, data including a number of news articles and Internet news on ‘dentistry’ posted for six months since the first COVID-19 case was found in January 2020 was collected through Big Kinds [10].

Through Big Kinds, keywords from a total of 54 media companies, including Kyunghyang Daily News, were collected. The collected multi-source data were loaded into the NetMiner program. This program can analyze not only news articles but also unstructured text data such as interviews and speeches on social media. It can also provide information for text analysis, including the parts of speech and TF-IDF of the extracted words, which makes it possible to identify the main topics [11].

Therefore, in this study, a top-ranked keyword analysis and a centrality analysis were conducted using the NetMiner program. Since language network analysis forms a network through the connections between words, the relative position of individual words according to those connections can be calculated as a quantitative network index [12]. In addition, by identifying the concepts with high values on various types of network centrality, NetMiner enables users to understand the intention and meaning of the entire text [13].

The keyword analysis found a total of 21,549 keywords in news articles on ‘dentistry’. From the top 100 high-frequency keywords, a keyword set was selected on the criterion that a keyword has more than five co-occurrences. Among the top 25 high-frequency keywords, the most frequently mentioned word was ‘dentistry’, and articles on ‘hospital’, ‘medical’, ‘Corona’, and ‘patient’ accounted for a large proportion. From this, it can be inferred that there is great concern about the treatment of patients in dental clinics due to COVID-19. As mentioned in the recent study by Lee and Jeon [14], the decrease in patient volume and revenue caused by the economic impact of COVID-19 on dental clinics is becoming a reality. In particular, the economic damage to a practice due to COVID-19 tended to be greater when the cumulative number of confirmed cases was higher, the dentist had more years of experience, and the number of employees was smaller [14].

Freeman [15] formalized three different measures of node centrality: degree centrality, closeness centrality, and betweenness centrality. Therefore, in this study, these three measures were used to examine the influence and relationships of the keywords in news articles.

First, degree centrality is a local centrality measure defined as the number of links held by each node [13]. It measures how many connections a keyword has in the network. Keywords with more connections have more autonomy and power because of their wider range of choices [6]. In other words, it shows which keywords have connections in the dentistry network and how many.

The degree centrality analysis showed that, in the top keyword group, ‘dentistry’ had the highest degree centrality, followed by ‘medical’, ‘hospital’, and ‘region’. These were followed, in order, by ‘health’, ‘Corona’, ‘patient’, ‘examination’, ‘support’, and ‘treatment’. This shows that the current social confusion caused by COVID-19 has emerged as a concern about the medical service of dental clinics. As the study by Lee and Jeon [14] shows, there has been a significant drop in patient volume and revenue in dental clinics due to COVID-19.

Second, betweenness centrality is a measure of how much one keyword acts as an intermediary between other keywords in the network [6]. The betweenness centrality analysis showed that ‘dentistry’ had the highest betweenness centrality, followed by ‘medical’, ‘region’, ‘hospital’, ‘Corona’, ‘patient’, and ‘health’. The betweenness centrality ranking is broadly similar to the degree centrality ranking, but some keywords differ. Keywords with higher betweenness centrality have more control over the flow of information within the network [6]. When the top group for betweenness centrality is compared with the top group for degree centrality, the keywords ‘dentistry’ and ‘medical’ hold the same rank, while ‘region’ and ‘Corona’ rank higher in betweenness centrality than in degree centrality. As the study by Lee and Jeon [14] suggested, the greater the number of cumulative confirmed cases, the greater the economic damage in terms of patient volume and revenue. Thus, it can be seen that the economic damage to dental clinics caused by COVID-19 was significantly affected by the structural characteristics of the region.

Third, closeness centrality measures how close a keyword is located to the center of the network and shows the proximity of the keywords connected in the network. Closeness centrality indicates which keyword has the most general influence across the network [6]. The closeness centrality analysis showed that, in the top group, ‘dentistry’ had the highest closeness centrality, followed by ‘medical’, ‘hospital’, ‘region’, ‘health’, and ‘Corona’. The same keywords rank highly as in the degree centrality analysis, indicating that these high-ranked keywords have the most influence in the entire network.
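The near-identical degree and closeness rankings can be checked arithmetically against Table 2. The sketch below is a reader's consistency check, not a step reported in the study. It assumes a 100-node network (the top 100 keywords), degree centrality normalized by n - 1 = 99, and closeness defined as 99 divided by the sum of shortest-path distances. Under those assumptions, the reported closeness values match what would result if every keyword that is not a direct neighbour were exactly two links away.

```python
N_NODES = 100  # assumed network size: the top 100 high-frequency keywords

# (degree centrality, closeness centrality) as reported in Table 2
reported = {
    "dentistry": (0.808, 0.839),
    "medical": (0.727, 0.786),
    "hospital": (0.697, 0.767),
    "region": (0.636, 0.733),
    "health": (0.545, 0.688),
}


def implied_degree(degree_centrality, n=N_NODES):
    """Number of direct neighbours implied by a degree centrality normalized by n - 1."""
    return round(degree_centrality * (n - 1))


def closeness_if_two_hops(degree, n=N_NODES):
    """Closeness when neighbours are at distance 1 and every other node is at distance 2."""
    total_distance = degree + 2 * (n - 1 - degree)
    return (n - 1) / total_distance


for keyword, (dc, cc) in reported.items():
    k = implied_degree(dc)
    print(f"{keyword}: {k} neighbours, predicted {closeness_if_two_hops(k):.3f}, reported {cc}")
```

If this reading is right, closeness in this network is a monotone function of degree, which would explain why the two rankings in Table 2 coincide and why only betweenness centrality reorders the keywords.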

This study attempted to identify the issues of the first half of 2020 on ‘dentistry’ by analyzing the language network of dental-related news articles. Information was collected through Big Kinds from a total of 54 media companies, including Kyunghyang Daily News, so the range of media covered was limited. Moreover, since the collected data were limited to news articles from the six months of the first half of 2020, general data on ‘dentistry’ were insufficient. Multi-faceted further studies on the changes in Korea’s dental industry caused by the COVID-19 pandemic in 2020 are required.

Conclusion

This study used big data on ‘dentistry’ collected through Big Kinds from a total of 54 media companies, including Kyunghyang Daily News, for the period from January 1 to June 30, 2020. Using NetMiner, keyword analysis was conducted to identify the language network of news articles on ‘dentistry’, and centrality analysis was conducted to identify the keywords. The conclusions are as follows.

1. For the keywords used in news articles on ‘dentistry’, the keyword set was selected from the top 100 high-frequency keywords on the criterion of more than five co-occurrences. The most frequently mentioned word was ‘dentistry’, and articles on ‘hospital’, ‘medical’, ‘Corona’, and ‘patient’ accounted for a large proportion.

2. In the degree centrality analysis, ‘dentistry’ had the highest degree centrality in the top keyword group, followed by ‘medical’, ‘hospital’, and ‘region’, and then by ‘health’, ‘Corona’, ‘patient’, ‘examination’, ‘support’, and ‘treatment’. In the betweenness centrality analysis, ‘dentistry’ had the highest betweenness centrality, followed by ‘medical’, ‘region’, ‘hospital’, ‘Corona’, ‘patient’, and ‘health’. In the closeness centrality analysis, ‘dentistry’ had the highest closeness centrality, followed by ‘medical’, ‘hospital’, ‘region’, ‘health’, and ‘Corona’.

In the news articles from the first half of 2020 extracted by Big Kinds for the search query ‘dentistry’, the most frequently mentioned word was ‘dentistry’, and articles on ‘hospital’, ‘medical’, ‘Corona’, and ‘patient’ accounted for a large proportion. This suggests that the overall social confusion caused by COVID-19 extended to the dental field and was affected by the structural characteristics of COVID-19 outbreak areas. Further studies on the changes in the Korean dental industry caused by the COVID-19 pandemic in 2020 are needed.

Conflict of Interest

No potential conflict of interest relevant to this article was reported.

References
  1. Cha YR, Kwon JS, Choi JH, Kim CY. The analysis of the current status of medical accidents and disputes researched in the Korean web sites. J Oral Med Pain. 2006;31:297-316.
  2. Lee BK. A comparative analysis study of IFLA school library guidelines using semantic network analysis. J Korean Libr Inf Sci Soc. 2020;51:1-21.
  3. Kim NI. The status and problem related to the old documents metadata standardization. J Soc Korean Hist Manuscr. 2009;34:107-47.
  4. Kim HK, Kwon KS, Jang DH. Language network analysis of ‘marine environment’ in news frame. J Korea Contents Assoc. 2016;16:385-98.
  5. Lee SJ, Chun YN. Examining public opinion on tourism using social media analytics: focusing on Gyeonggi-do. GRI Rev. 2016;18:83-109.
  6. Lee SS. Network analysis methods. Seoul: Nonhyung. 2012.
  7. Central Disaster and Safety Countermeasures Headquarters [Internet]. Cheongju: MOHW [cited 2020 Nov 10].
    Available from: http://ncov.mohw.go.kr
  8. Odeh ND, Babkair H, Abu-Hammad S, Borzangy S, Abu-Hammad A, Abu-Hammad O. COVID-19: present and future challenges for dental practice. Int J Environ Res Public Health. 2020;17:3151.
  9. Ather A, Patel B, Ruparel NB, Diogenes A, Hargreaves KM. Coronavirus disease 19 (COVID-19): implications for clinical dental care. J Endod. 2020;46:584-95.
  10. Big kinds user manual [Internet]. Seoul: Korea Press Foundation [cited 2020 Nov 10].
    Available from: https://www
  11. Text network analysis service [Internet]. Seongnam: Cyram [cited 2020 Nov 10].
    Available from: http://www
  12. Park CS, Chung CW. Text network analysis: detecting shared meaning through socio-cognitive networks of policy stakeholders. J Gov Stud. 2013;19:73-108.
  13. Lee SS. Network analysis methods applications and limitations. Seoul: Cheongram. 2018.
  14. Lee GY, Jeon JE. Factors affecting COVID-19 economic loss to dental institutions: application of multilevel analysis. J Korean Dent Assoc. 2020;58:627-38.
  15. Freeman LC. Centrality in social networks conceptual clarification. Soc Netw. 1978-1979;1:215-39.

