Journal of the Korean Society for Information Management (정보관리학회지), Korean Society for Information Management

1

A Study on the Visual Representation of TREC Text Documents in Building a Digital Library

정기태 (Assistant Professor, University of Oklahoma School of Library and Information Studies); 박일종 (Keimyung University). 2004, Vol.21, No.3, pp.1-14. https://doi.org/10.3743/KOSIM.2004.21.3.001

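Abstract (translated from the Korean)

Users are helped by various visual representations of documents when they search for similar documents, and all research on information retrieval offers solutions for meeting users' diverse needs. The proposed solutions range from alphabetically arranged papyrus documents to card catalogs, storage on microfilm, and files kept on computer disks. Most information retrieval systems also use document surrogates, that is, bibliographic material such as summaries, tables of contents, abstracts, reviews, and machine-readable cataloging (MARC) records, in place of the full paper. This paper investigates another type of document surrogate, built with a method for grouping term lists. These document surrogates are represented as coordinates on a two-dimensional graph using Multidimensional Scaling (MDS). The distance between coordinates on the graph can be interpreted as the similarity of the documents: the closer two points are, the more similar the content of the two documents.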

Abstract

Visualizing documents helps users when they search for similar documents. All research in information retrieval addresses the problem of a user with an information need facing a data source that contains an acceptable solution to that need. In various contexts, adequate solutions to this problem have included alphabetized cubbyholes housing papyrus rolls, microfilm registers, card catalogs, and inverted files coded onto discs. Many information retrieval systems rely on the use of a document surrogate. Though they might be surprised to discover it, nearly every information seeker uses an array of document surrogates. Summaries, tables of contents, abstracts, reviews, and MARC records are all document surrogates. That is, they stand in for a document, allowing a user to make some decision regarding it: whether to retrieve a book from the stacks, whether to read an entire article, and so on. In this paper, another type of document surrogate is investigated using a method for grouping term lists. Using Multidimensional Scaling (MDS), those surrogates are visualized on a two-dimensional graph. The distances between points on the graph represent the similarity of the documents: the closer the points, the more similar the documents.
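The idea of placing term-list surrogates on a two-dimensional graph can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the three term lists are invented, Jaccard distance is an assumed dissimilarity measure, and the MDS step is a simple gradient descent on squared stress rather than whatever MDS variant the paper used.

```python
import math
import random


def jaccard_distance(terms_a, terms_b):
    """Dissimilarity of two term lists: 1 - |A & B| / |A | B| (assumed measure)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mds_2d(dist, steps=2000, lr=0.05, seed=0):
    """Place n points in 2D so that pairwise distances approximate `dist`.

    Minimizes the sum of (d_ij - dist[i][j])^2 by plain gradient descent.
    """
    rng = random.Random(seed)
    n = len(dist)
    pts = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        for i in range(n):
            gx = gy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = pts[i][0] - pts[j][0]
                dy = pts[i][1] - pts[j][1]
                d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                coef = 2.0 * (d - dist[i][j]) / d  # gradient of the squared error
                gx += coef * dx
                gy += coef * dy
            pts[i][0] -= lr * gx
            pts[i][1] -= lr * gy
    return pts


# Hypothetical surrogates: each document is reduced to a short term list.
surrogates = {
    "doc_a": ["retrieval", "query", "index", "ranking"],
    "doc_b": ["retrieval", "query", "index", "web"],
    "doc_c": ["ocean", "metadata", "video", "archive"],
}
names = list(surrogates)
dist = [[jaccard_distance(surrogates[a], surrogates[b]) for b in names] for a in names]
coords = mds_2d(dist)
```

In this toy data, doc_a and doc_b share three of five distinct terms, so their points should land closer together than either does to doc_c.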

2

Web Information Retrieval Based on Natural Language Query Analysis and Query Term Expansion

윤성희 (Sangmyung University). 2004, Vol.21, No.2, pp.235-248. https://doi.org/10.3743/KOSIM.2004.21.2.235

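Abstract (translated from the Korean)

Compared with using keywords and Boolean expressions for web document retrieval, entering a natural language query sentence is a far more ideal interface for users of a retrieval system. This paper proposes a multi-search technique that parses the natural language query sentence entered by the user and expands the search terms based on its syntactic structure. By traversing the parse tree to combine or split structurally related compound nouns, and by running expanded multiple searches over terms with variant spellings and abbreviated forms, the recall and precision of a web information retrieval system can be improved.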

Abstract

For users of information retrieval systems, a natural language query is a more ideal interface than keywords and Boolean expressions. This paper proposes a retrieval technique that expands keywords from the syntactically analyzed structure of the natural language query entered by the user. By combining or splitting compound nouns based on a traversal of the query's syntactic tree, and by expanding variant and abbreviated forms into multiple keywords, the technique can improve the recall and precision of the retrieval system.
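The expansion step described above might look something like the following sketch. It assumes the syntactic analysis has already produced groups of structurally related nouns, and it uses a hand-written variant table; the function names, the noun groups, and the table are all hypothetical and are not taken from the paper.

```python
def expand_compound(nouns):
    """Return the combined compound plus its split components."""
    forms = [" ".join(nouns)]
    if len(nouns) > 1:
        forms.extend(nouns)  # split form: each noun on its own
    return forms


def expand_variants(term, variants):
    """Add variant spellings and abbreviations from a lookup table."""
    return [term] + variants.get(term, [])


def expand_query(noun_groups, variants):
    """Build the list of keywords to submit as multiple searches."""
    seen = set()
    keywords = []
    for group in noun_groups:
        for form in expand_compound(group):
            for keyword in expand_variants(form, variants):
                if keyword not in seen:  # keep first occurrence only
                    seen.add(keyword)
                    keywords.append(keyword)
    return keywords


# Hypothetical input: noun groups that a parser is assumed to have extracted.
noun_groups = [["information", "retrieval"], ["system"]]
variants = {"information retrieval": ["IR"]}  # assumed variant table
keywords = expand_query(noun_groups, variants)
```

For this input, `keywords` is `["information retrieval", "IR", "information", "retrieval", "system"]`; each entry would be issued as a separate search and the results merged.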

3

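Cluster Hierarchy Depth and Initial Value Selection in Hierarchical Clustering Using the K-Means Algorithm

이신원 (Jungwon University); 안동언 (Chonbuk National University); 정성종 (Chonbuk National University). 2004, Vol.21, No.4, pp.173-185. https://doi.org/10.3743/KOSIM.2004.21.4.173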

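Abstract (translated from the Korean)

As information and communication technology advances, the volume of information grows and the result lists retrieved for user queries grow with it, so fast, high-quality document clustering algorithms play an important role. Many papers report good performance with hierarchical clustering methods, but these methods take a long time. The K-means algorithm, by contrast, is a way to reduce time complexity. In this paper, the Condor system, a hierarchical clustering system, was implemented so that information can be retrieved simply, with high quality, and efficiently. The system uses the K-Means algorithm and achieved a precision of 88% by adjusting the cluster hierarchy depth and the initial values.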

Abstract

Fast and high-quality document clustering algorithms play an important role in data exploration by organizing large amounts of information into a small number of meaningful clusters. Many papers have shown that hierarchical clustering performs well but is limited by its quadratic time complexity. In contrast, with a large number of variables, K-means has a time complexity that is linear in the number of documents, but it is thought to produce inferior clusters. This paper presents the Condor system, which uses the K-Means algorithm. Compared with the regular method, in which the initial centroids are established in advance, the performance of our method improved considerably.
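A rough sketch of hierarchical clustering built from repeated K-means, with the hierarchy depth as a parameter, is given below. It is not the Condor implementation: the farthest-first rule for choosing initial centroids is an assumption made for illustration, and the points are invented two-dimensional vectors rather than document vectors.

```python
import math


def farthest_first(points, k):
    """Assumed initialization: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=20):
    """Plain K-means; returns the k groups of points (some may be empty)."""
    centroids = farthest_first(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups


def hierarchical_kmeans(points, k, depth):
    """Split recursively with K-means until `depth` levels are built."""
    if depth == 0 or len(points) <= k:
        return points  # leaf: a flat list of points
    return [hierarchical_kmeans(g, k, depth - 1) for g in kmeans(points, k) if g]


# Hypothetical data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
tree = hierarchical_kmeans(points, k=2, depth=1)
```

With `depth=1` the result is a single split, `[[(0, 0), (0, 1)], [(10, 10), (10, 11)]]`; larger depths nest further splits inside each group, which is where the trade-off between hierarchy depth and cluster quality would be tuned.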

4

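A Study on the Implementation of a Multimedia Metadatabase and an Integrated Digital Library Information System for a Specialized Oceanographic Information Center

한종엽 (Korea Ocean Research and Development Institute); 최영준 (㈜킨스, e-Business Division). 2004, Vol.21, No.4, pp.5-26. https://doi.org/10.3743/KOSIM.2004.21.4.005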

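Abstract (translated from the Korean)

This study surveyed and analyzed prior research in order to implement the multimedia metadatabase and integrated digital library information system needed for efficient information services at a specialized oceanographic information center in Korea. The resources studied covered printed media, network resources, full-text files, and video in the field of oceanography. The study aimed to provide an integrated information retrieval service based on MODS, which is used as a Library of Congress (LC) standard, for describing and organizing multimedia content resources, including printed media. To this end, the study examined the various information resources in oceanography, multimedia information processing, metadata descriptive elements such as those of MODS, a metadata classification scheme, system configuration, and approaches to implementing retrieval.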

Abstract

A literature analysis for planning and implementing a multimedia metadatabase and an integrated digital library information system was carried out to organize the various oceanographic resources of the Oceanographic Information Center, the first of its kind in Korea. The study covered resources ranging from printed matter, network resources, and full text to VOD. The focus of the analysis lies in providing a practical integrated information retrieval service for oceanographic resources, based on a framework of effective MODS metadata with network resource description. The analyses covered oceanographic resources, multimedia information processing, MODS metadata descriptive elements, metadata classification, system organization, and retrieval, for planning and implementing the multimedia metadatabase system.
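To give a sense of what a MODS-based description of a multimedia resource could look like, here is a minimal sketch that builds one record. The record content, the helper name `build_mods_record`, and the example URL are invented for illustration; only a few MODS elements (`titleInfo`/`title`, `typeOfResource`, `location`/`url`) are shown, and the element set actually chosen in the study is not known from the abstract.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"  # MODS version 3 namespace


def build_mods_record(title, resource_type, url=None):
    """Build a very small MODS record as an ElementTree element."""
    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    ET.SubElement(mods, f"{{{MODS_NS}}}typeOfResource").text = resource_type
    if url:
        # Network resources and full-text files can carry a location URL.
        location = ET.SubElement(mods, f"{{{MODS_NS}}}location")
        ET.SubElement(location, f"{{{MODS_NS}}}url").text = url
    return mods


# Hypothetical video resource from an oceanographic collection.
record = build_mods_record(
    "Tidal current survey footage",
    "moving image",
    "https://example.org/video/1",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Describing printed matter, network resources, and video with the same element set is what would let a single index serve an integrated search across all of them.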

5

A Comparative Analysis of Models of Web Information Seeking Behavior

김성진 (Inha Technical College). 2004, Vol.21, No.2, pp.211-233. https://doi.org/10.3743/KOSIM.2004.21.2.211

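Abstract (translated from the Korean)

Because the web is a new information environment distinct from the traditional information retrieval systems studied so far, sufficient research from a new perspective is needed to understand the interaction between users and information retrieval systems on the web, and a web-based information seeking paradigm that supports such research needs to take hold. In this context, this study reviewed and comparatively analyzed the theoretical models presented in the literature on web information seeking behavior. The models presented in the studies by Wang, Hawk, Tenopir, Hsieh-Yee, Choo, Detlor, Turnbull, Chun and Cooper, Rieh, and Spink were discussed. The analysis shows that web information seeking models fall broadly into interaction models, information seeking behavior models, and evaluation models, and that, compared with traditional models of the information seeking process, they emphasize the interaction of multiple factors and a nonlinear view of information seeking behavior.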

Abstract

The web is a new information environment with characteristics that differ from those of a traditional IR environment. More research from a new point of view is needed, along with the adoption of a new research paradigm, to understand user-system interaction on the web. The purpose of this study is to review and analyze the models of web-based information seeking behavior proposed by Wang, Hawk & Tenopir, Hsieh-Yee, Choo, Detlor & Turnbull, Chun & Cooper, Rieh, and Spink. The comparative analysis indicates that web-based information seeking models fall into three categories (interaction models, information seeking behavior models, and evaluation models) and that they are based on multifaceted interaction and a nonlinear perspective.

6

A Case Analysis of Concept Modeling of Medical Terminology Using Ontologies

이현실 (Wonkwang University). 2004, Vol.21, No.3, pp.141-160. https://doi.org/10.3743/KOSIM.2004.21.3.141

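Abstract (translated from the Korean)

In the medical information field, attention has recently turned to medical terminology systems that use ontology-based concept modeling as a means of supporting clinical knowledge management and more efficient medical information retrieval. To provide basic material for applying such systems or developing new ones in the Korean medical information field, this study reviewed the theory of information modeling and ontology, and analyzed and compared four representative cases abroad in which ontology-based terminology systems have been developed. The key characteristics can be summarized as follows: the standardization of medical terminology in MeSH and the conceptualization of terms in UMLS, both identified as informal-level ontologies; and the theorization of medical ontology integration in ON9 and the semantic model and formalization of medical knowledge in GALEN, both formal-level ontologies. Ontology applications should be differentiated in level according to the intended system, and the results of this analysis can serve as a reference.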

Abstract

Recent research in the field of medical information systems has paid much attention to ontology-based medical terminology systems that support clinical study and effective information search. This study aims to provide a basis for the further application or construction of ontology systems in Korea. It reviews the theory of concept modeling and ontology, and analyzes four cases of conceptual modeling of medical terminologies by ontology. The findings show the following characteristics of medical ontologies: (1) the standardization of terminology in MeSH; (2) the conceptualization of terminology in UMLS; (3) the theory of ontology integration in ON9; and (4) the reference model of medical knowledge with formalization in GALEN. Cases (1) and (2) are shown to be informal ontologies, and cases (3) and (4) formal ontologies. The application and construction of an ontology should be differentiated according to the level of the proposed system, and this analysis will provide useful information for researchers and developers of such systems.

7

Evaluating the Retrieval Performance of XML Documents Using an Object-Relational Database

김희섭 (Kyungpook National University). 2004, Vol.21, No.2, pp.189-210. https://doi.org/10.3743/KOSIM.2004.21.2.189

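Abstract (translated from the Korean)

The purpose of this study is to evaluate the retrieval performance for XML documents under an object-relational database approach. The paper describes the methods for indexing and retrieving XML documents at INEX (Initiative for the Evaluation of XML retrieval), as well as the experimental methodology. As in most traditional information retrieval evaluation experiments, the test collection used in this study consisted of documents (that is, XML documents), topics, ad hoc retrieval, relevance judgments, and evaluation. EXIMA™ Supply, a kind of native XML database developed on ORDBMS technologies, was used to store and search the large collection of XML documents provided by INEX. The paper gives an overview of the functions of the system used in the experiment, describes the indexing and retrieval process, and discusses the performance evaluation results at INEX 2002 and the functions that need to be improved.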

Abstract

The purpose of this study is to evaluate the performance of XML retrieval based on an ORDBMS (Object-Relational Database Management System) approach. This paper describes indexing and retrieval methods for XML documents and the experimental methodology at INEX (Initiative for the Evaluation of XML retrieval). As in any traditional information retrieval experiment, the test collection consisted of documents, topics/queries, a task, relevance assessments, and evaluation. EXIMA™ Supply, a kind of native XML database based on ORDBMS technologies, was used for this experiment. Although this approach has many benefits, for example no delay in storing and searching XML documents, it showed relatively disappointing retrieval performance at INEX 2002. This result may have arisen because the given topics had to be decomposed and modified to be processed by the XPath processor; during this modification the original meaning of a topic can inevitably change, and some important information may be lost.
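The kind of decomposition mentioned above, where a topic is reduced to a structural path plus a keyword condition, can be illustrated with a small sketch. It uses the limited XPath support in Python's standard library rather than the system evaluated in the paper; the sample document, its tag names, and the `search` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the tag names are illustrative only.
DOC = (
    "<article>"
    "<fm><ti>XML retrieval</ti></fm>"
    "<bdy>"
    "<sec><st>Indexing</st><p>Element-level indexing of XML documents.</p></sec>"
    "<sec><st>Results</st><p>Evaluation at the workshop.</p></sec>"
    "</bdy>"
    "</article>"
)


def search(xml_text, path, keyword):
    """Return elements matching `path` whose text contains `keyword`.

    The structural part of a topic becomes the path; the content part
    becomes a plain substring test, which is where nuance can be lost.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.findall(path):
        text = " ".join(elem.itertext())
        if keyword.lower() in text.lower():
            hits.append(elem)
    return hits


# A topic such as "sections about indexing" reduced to path + keyword.
hits = search(DOC, ".//sec", "indexing")
titles = [hit.findtext("st") for hit in hits]
```

Here `titles` is `["Indexing"]`. A topic phrased in natural language carries more than one path and one keyword can express, which is consistent with the abstract's remark that meaning may change when topics are rewritten for an XPath processor.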

8

A Study on Improving the Classification of Web Resources in the Field of Medicine

정경희 (Hansung University). 2004, Vol.21, No.2, pp.89-106. https://doi.org/10.3743/KOSIM.2004.21.2.089

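Abstract (translated from the Korean)

Medical materials exist on the web on a vast scale, and each search engine classifies and presents them, but the resulting structures lack consistency and systematic organization. This paper therefore proposes that, in building a classification scheme for medical web materials, search engines should follow NLMC, a specialized bibliographic classification for medicine, and that the arrangement of items should be based on the relatedness of subjects. It also proposes that clear criteria be set for cases in which an item is classified repeatedly under first-level and second-level categories with circularity in mind, and that category names be chosen from MeSH, the standard vocabulary of the medical field, and from the medical terminology collection, to improve interoperability with existing library information retrieval systems.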

Abstract

There are many web materials in the field of medicine, and many search engines classify them through directories. However, the organization of these directories lacks consistency and systematization. This paper suggests several guidelines to help search engine managers organize medical materials on the web systematically. NLMC, a special classification system for medicine, should be applied in developing medical directories in search engines. The items of the directories should be arranged according to the relatedness of subjects among the subfields of medical science. Clear criteria should be established for classifying an item under several directories. In addition, controlled vocabularies or glossaries for medicine, such as MeSH and the English-Korean, Korean-English Medical Terminology Collection, should be used to select the names of items in medical directories.
