Budget Amount |
¥41,340,000 (Direct Cost: ¥31,800,000, Indirect Cost: ¥9,540,000)
Fiscal Year 2009: ¥14,170,000 (Direct Cost: ¥10,900,000, Indirect Cost: ¥3,270,000)
Fiscal Year 2008: ¥12,740,000 (Direct Cost: ¥9,800,000, Indirect Cost: ¥2,940,000)
Fiscal Year 2007: ¥14,430,000 (Direct Cost: ¥11,100,000, Indirect Cost: ¥3,330,000)
|
Research Abstract |
The Web has grown into an indispensable social platform for global information flow, sharing, and even co-creation. We carried out research on the foundations of semantic computing toward a next-generation Web, one that allows computers to understand the meaning of Web information and thereby to retrieve, mine, edit, and organize it at the semantic level as well as the surface level. Specifically, we focused on CDL (Concept Description Language), a common language for representing the conceptual meaning expressed in texts. CDL is a technology that originated in Japan and has been undergoing international standardization at the W3C, promoted by our group. Because CDL does not depend on any particular language and is universal across natural languages, it can be called a "computer Esperanto" and may also help overcome language barriers around the world. The first current issue for CDL is the conversion of natural-language text into the CDL representation. As with machine translation, fully automatic conversion is not feasible in the near future. We have therefore studied and developed a semi-automatic conversion method built around word sense disambiguation, in which a human writer or operator selects the correct word sense whenever the computer cannot determine it. As another issue regarding CDL, we have developed a semantic retrieval method for CDL data that performs semantic-level matching efficiently. The basis of the CDL representation is to connect two word entities in a sentence with one of a set of predefined relational labels. In related research, we have devised an efficient method for computing the similarity between two word entities, based on the distributional hypothesis and using a search engine. This research was well received internationally: it was accepted as a full paper at WWW2009 (the most prestigious conference in the Web technology field). Building on this result, we have developed a new type of search engine, called a latent relational search engine, which accepts as a query two entity pairs with one blank element, e.g., {(Tokyo, Japan), (?, France)}, and produces an answer, in this case {?=Paris}. A patent application has been filed for the mechanism of this latent relational search. In addition, as an extension of this relational search, we have invented an efficient co-clustering method for mining typical and meaningful entity pairs from the many entity-pair candidates extracted from Web texts; this framework is called open information extraction. This research was also accepted as a full paper at WWW2010, and a patent application has been filed for it as well.
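
The latent relational search query {(Tokyo, Japan), (?, France)} can be illustrated with a minimal sketch. It is not the project's actual mechanism: the pattern counts in `PAIR_PATTERNS` are invented toy data standing in for contexts retrieved from a search engine, and the function name `latent_relational_search` and the cosine scoring are assumptions made only for illustration. The idea shown is that each entity pair is described by the lexical patterns in which it occurs, and candidate pairs are ranked by how closely their patterns match those of the source pair.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: lexical patterns observed between entity pairs.
# In a real system these would come from Web text or search-engine snippets.
PAIR_PATTERNS = {
    ("Tokyo", "Japan"): Counter({"X is the capital of Y": 3,
                                 "X , the largest city in Y": 2}),
    ("Paris", "France"): Counter({"X is the capital of Y": 4,
                                  "X , the largest city in Y": 1}),
    ("Lyon", "France"): Counter({"X is a city in Y": 3}),
    ("Louvre", "France"): Counter({"X is a museum in Y": 2}),
}


def cosine(a, b):
    """Cosine similarity between two pattern-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def latent_relational_search(source_pair, known_entity, pair_patterns=PAIR_PATTERNS):
    """Rank candidates for '?' in the query {source_pair, (?, known_entity)}."""
    source = pair_patterns[source_pair]
    scored = [
        (cosine(source, patterns), first)
        for (first, second), patterns in pair_patterns.items()
        if second == known_entity and (first, second) != source_pair
    ]
    scored.sort(reverse=True)  # highest relational similarity first
    return [(first, score) for score, first in scored]


if __name__ == "__main__":
    for entity, score in latent_relational_search(("Tokyo", "Japan"), "France"):
        print(f"{entity}: {score:.3f}")
```

With this toy data, Paris ranks first for the query because it shares the "capital of" and "largest city in" patterns with (Tokyo, Japan), while the other candidates share none.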
|