Information Retrieval and Extraction Exercise
(IR & IE Contest for Japanese language)
Last modification, February 14, 1999
An IR (Information Retrieval) and IE (Information Extraction)
contest for the Japanese language is planned.
The first contest will be run in a "grass roots" style;
international participants are welcome, provided that they
share the following objectives:
- To contribute to the improvement of the field
- To widen the research area and the circle of researchers
- To increase the amount of available corpora and databases
- To promote this kind of effort in the future
There is no fee to participate. However, if you participate in the IR task,
you have to buy a newspaper corpus, "Mainichi Shinbun (94, 95)",
which is sold by a company, "Nichigai-associates"
(about $1,000 for each year).
For NE participants, the test articles will be provided
free of charge, thanks to Mainichi Shinbun.
Also, most of the information (definitions, etc.) will be distributed
in Japanese; please do not expect everything to be translated
into other languages.
(Sekine (firstname.lastname@example.org) will privately assist you in some cases.)
Note that this is not a contest using the NACSIS collection.
Two tasks are planned.
Anyone can participate in one or both tasks.
NE: Named Entity Task
Basically, it is similar to the MUC-NE or MET task.
There are minor differences; for example, an "artifact" category, which includes
product names, names of services, etc., is added.
Also, there will be two kinds of tests: one for general-domain
texts (60-70 articles), and the other for specific-domain texts.
The domain will be announced about two weeks before the
IR: Information Retrieval
Basically, it is similar to the TREC adhoc task.
The target is to retrieve about 300 relevant documents
from two years of newspaper articles.
There will be about 30 topics.
These tasks are designed for technology evaluation
rather than for commercial purposes.
For example, many people have pointed out that the interface is an important
issue in IR; however, such issues should be addressed
in a future IREX.
- June 30, 1998 : Distribute draft version of definitions, sample data
- July 31, 1998 : Preliminary application due (this is not a hard deadline)
- September 16, 1998 : Close the discussion for the definitions
- == Dry-run ==
- November 9, 1998 : IR topics distribution
- November 16, 1998 : IR result due
- November 17, 1998 : NE text distribution
- November 20, 1998 : NE result due
- February 28, 1999 : Application due
- == Formal-run ==
- April 5, 1999 : Distribute IR queries
- April 12, 1999 : IR result due (JST 23:59)
- April 13, 1999 : Freeze NE system development
- May 13, 1999 : Distribute NE tasks
- May 17, 1999 : NE result due (JST 23:59)
- September, 1999 : Workshop (planned, in Tokyo)
More Information (in Japanese - EUC)
- If you would like to join the mailing list, please send e-mail to Sekine (email@example.com).
There are several non-native speakers of Japanese on the mailing list.
We try to include some English so that they can tell what kind of topic is
being discussed in each message.
Please send a signed registration form
to Dr. Isahara (CRL) by March 15, 1999.
(This is not a hard deadline; please contact firstname.lastname@example.org
if you have any individual concerns.)
The address is shown on the form.
Organized by : IREX Committee
Mailing list : email@example.com
Co-chair : S.Sekine (NYU), H.Isahara (CRL)
Advisor : M.Nagao (Kyoto-U), H.Tanaka (TITech), R.Grishman (NYU), T.Ishikawa (ULIS), D.Harmon (NIST), H.Iida (SONY)
Committee Member : T.Tokunaga (TITech), S.Kurohashi (Kyoto-U), M.Okumura (JAIST), C.Nobata (U-Tokyo),
K.Kita (Tokushima-U), K.Inui (KIT), Y.Nakagawa (YNU), A.Fujii (ULIS), T.Wakao (TAO),
N.Kando (NACSIS), K.Hashida (ETL), E.Sumita (ATR), M.Murata, K.Uchimoto (CRL),
N.Noguchi (Matsushita), A.Okumura, S.Fukushima (NEC), Y.Ogawa (RICOH), T.Sakai (Toshiba),
J.Fukumoto (Oki), T.Kitani, Y.Eriguchi (NTT Data), S.Nakawatase (NTT), J.Tomiura (Mitsubishi),
R.Ochitani (Fujitsu), S.Ogino (IBM)