Chen Li's Publications
Refereed Conference Full Papers
1. AsterixDB: A Scalable, Open Source BDMS,
Sattam Alsubaiee, Yasser Altowim, Hotham Altwaijry, Alexander Behm, Vinayak R. Borkar, Yingyi Bu, Michael J. Carey,
Inci Cetindil, Madhusudan Cheelangi, Khurram Faraaz, Eugenia Gabrielova, Raman Grover, Zachary Heilbron,
Young-Seok Kim, Chen Li, Guangqiang Li, Ji Mahn Ok, Nicola Onose, Pouria Pirzadeh, Vassilis J. Tsotras,
Rares Vernica, Jian Wen, Till Westmann. PVLDB 7(14): 1905-1916 (2014)
2. Storage Management in AsterixDB,
Sattam Alsubaiee, Alexander Behm, Vinayak R. Borkar, Zachary Heilbron, Young-Seok Kim, Michael J. Carey,
Markus Dreseler, Chen Li. PVLDB 7(10): 841-852 (2014)
3. Efficient instant-fuzzy search with proximity ranking,
Inci Cetindil, Jamshid Esmaelnezhad, Taewoo Kim, Chen Li. ICDE 2014: 328-339
4. Efficient direct search on compressed genomic data,
Xiaochun Yang, Bin Wang, Chen Li, Jiaying Wang, Xiaohui Xie. ICDE 2013: 961-972
5. Improving regular-expression matching on strings using negative factors,
Xiaochun Yang, Bin Wang, Tao Qiu, Yaoshu Wang, Chen Li. SIGMOD Conference 2013: 361-372
6. String similarity measures and joins with synonyms,
Jiaheng Lu, Chunbin Lin, Wei Wang, Chen Li, Haiyong Wang. SIGMOD Conference 2013: 373-384
7. Supporting Efficient Top-k Queries in Type-Ahead Search,
Guoliang Li, Jiannan Wang, Chen Li, Jianhua Feng. SIGIR 2012. [PDF], [PPTX], [Demo]
- Inside “Big Data Management”: Ogres, Onions, or Parfaits?
Vinayak Borkar, Michael J. Carey, and Chen Li, EDBT 2012. [PDF]
- Location-Based Instant Search, Shengyue Ji, Chen Li, SSDBM
2011: 17-36. [PDF]
- CHIME: An Efficient Error-Tolerant Chinese Pinyin Input Method, Yabin
Zheng, Chen Li, Maosong Sun, IJCAI
2011, 2551-2556. [PDF], [Demo]
- Answering Approximate String Queries on Large Data Sets Using
External Memory, Alexander Behm, Chen Li, and Michael Carey, ICDE
2011. [PDF] [Source Code]
- Supporting Location-Based Approximate-Keyword Queries, Sattam
Alsubaiee, Alexander Behm, and Chen Li, ACM GIS 2010. [PDF]
[PPT] [Source Code and Demos]
- Hybrid Indexing and Seamless Ranking of Spatial and Textual Features
of Web Documents, Ali Khodaei, Cyrus Shahabi, Chen Li, DEXA 2010. [PDF]
- Efficient Parallel Set-Similarity Joins Using MapReduce. Rares
Vernica, Michael J. Carey, Chen Li, SIGMOD 2010. [PDF], [Source Code]
- Type-Ahead Search on Relational Data: a TASTIER Approach,
Guoliang Li, Shengyue Ji, Chen Li, and Jianhua Feng, SIGMOD 2009. [PDF], [PPTX].
- Efficient Interactive Fuzzy Keyword Search, Shengyue Ji,
Guoliang Li, Chen Li, and Jianhua Feng, WWW 2009. [PDF], [PPTX]
- Best-Effort Top-k Query Processing Under Budgetary Constraints,
Michal Shmueli-Scheuer, Chen Li, Yosi Mass, Haggai Roitman, Ralf Schenkel, and Gerhard
Weikum, ICDE 2009. [PDF], [PPT]
- Space-Constrained Gram-Based Indexing for Efficient Approximate
String Search, Alexander Behm, Shengyue Ji, Chen Li, and Jiaheng Lu,
ICDE 2009. [PDF], [PPTX]
- Cost-Based Variable-Length-Gram Selection for String Collections
to Support Approximate Queries Efficiently, Xiaochun Yang, Bin Wang,
and Chen Li, ACM SIGMOD 2008. [PDF], [PPT]
- Efficient Merging and Filtering Algorithms for Approximate String
Searches, Chen Li, Jiaheng Lu, and Yiming
Lu. ICDE 2008. [PDF], [PPT], [Source Code].
- Data Exchange with Arithmetic Comparisons, Foto
Afrati, Chen Li, and Vassia Pavlaki. EDBT 2008.
[PDF]
- VGRAM: Improving Performance of Approximate Queries on String
Collections Using Variable-Length Grams, Chen Li, Bin Wang, and
Xiaochun Yang. VLDB 2007. [PDF], [PPT]
- Processing Spatial-Keyword (SK) Queries in Geographic Information
Retrieval (GIR) Systems, Ramaswamy Hariharan, Bijit Hore, Chen Li,
Sharad Mehrotra, SSDBM 2007. [PDF]
- Protecting Individual Information Against Inference Attacks in
Data Publishing, Chen Li, Houtan Shirani-Mehr,
and Xiaochun Yang. DASFAA 2007. [PDF]
- Supporting Approximate
Similarity Queries with Quality Guarantees in P2P Systems, Qi Zhong, Iosif Lazaridis, Mayur Deshpande, Chen Li, Sharad Mehrotra, Hal Stern, COMAD 2006,
December 14-16, 2006, Delhi, India. [PDF]
- Relaxing Join and Selection Queries. Nick Koudas, Chen Li,
Anthony Tung, and Rares Vernica. VLDB 2006, Seoul,
Korea,
2006. (13.2% accepted) [PDF], [PPT], [Source Code]
- Selectivity Estimation for
Fuzzy String Predicates in Large Data Sets, Liang Jin and Chen Li.
VLDB 2005, Trondheim, Norway,
August 30 - September 2, 2005. (16% accepted) [PDF], [PPT],
[Source Code].
- Indexing Mixed Types for
Approximate Retrieval, Liang Jin, Nick Koudas,
Chen Li, Anthony K.H. Tung. VLDB 2005, Trondheim,
Norway,
August 30 - September 2, 2005. (16% accepted) [PDF], [PPT],
[Source Code].
- Secure XML Publishing
without Information Leakage in the Presence of Data Inference.
Xiaochun Yang and Chen Li. VLDB, Toronto,
Canada,
August 29 - September 3, 2004. [PDF], [PPT]. (16% accepted)
- NNH: Improving Performance
of Nearest-Neighbor Searches Using Histograms. Liang Jin, Nick Koudas,
Chen Li. EDBT, Crete, Greece,
March 2004. (14% accepted) [PDF],
[Full version], [PPT]
- On Containment of
Conjunctive Queries with Arithmetic Comparisons. Foto
Afrati, Chen Li, Prasenjit Mitra.
EDBT, Crete, Greece,
March 2004. (14% accepted) [PDF].
- Materializing Views with
Minimal Size to Answer Queries. Rada Chirkova and Chen Li. ACM PODS, June 2003, San
Diego, CA. (20% accepted). [PDF], [PPT]
- Efficient Record Linkage
in Large Data Sets, Liang Jin, Chen Li, and Sharad Mehrotra, in the
8th International Conference on Database Systems for Advanced Applications
(DASFAA 2003) 26 - 28 March, 2003, Kyoto, Japan. (33% accepted) [PS], [PDF], [PPT], [Source Code]. Received
DASFAA 2013 10-year Best Paper Award.
- Executing SQL over
Encrypted Data in the Database-Service-Provider Model. Hakan Hacigumus, Bala Iyer, Chen Li, and
Sharad Mehrotra. In ACM SIGMOD, June 3-6, 2002 Madison,
Wisconsin. (18%
accepted). Received SIGMOD 2012
10-year Test-of-Time Award. [PDF]
- Answering Queries Using
Views with Arithmetic Comparisons. Foto
Afrati, Chen Li, and Prasenjit Mitra. In ACM Symposium on Principles of Database
Systems (PODS), June 3-6, 2002 Madison,
Wisconsin. (22%
accepted)
- Generating Efficient Plans
for Queries Using Views. Foto Afrati, Chen
Li, and Jeff Ullman. In the Proc. of the 30th ACM SIGMOD Conference, Santa
Barbara, CA,
May, 2001. (15% accepted) [PS] [PDF] [PPT]
- Minimizing View Sets
without Losing Query-Answering Power. Chen Li, Mayank
Bawa, and Jeff Ullman. In the 8th International
Conference on Database Theory (ICDT), London,
UK,
January, 2001. [PS] [PDF], [PPT]. Full version: [PS] [PDF]. (35% accepted)
- On Answering Queries in
the Presence of Limited Access Patterns. Chen Li and Edward Chang. In
the 8th International Conference on Database Theory (ICDT), London,
UK,
January, 2001. [PS] [PDF] [PPT].
(35% accepted)
- Query Planning with
Limited Source Capabilities. Chen Li and Edward Chang.
International Conference on Data Engineering (ICDE), pages 401-412,
San Diego, CA, February, 2000. (14% accepted) [PS] [PDF]
[PPT]. Full version: [PS] [PDF]
- Computing Capabilities of
Mediators. Ramana Yerneni,
Chen Li, Hector Garcia-Molina, Jeffrey Ullman. SIGMOD'99, Philadelphia,
PA, May 1999. (20%
accepted) [PS] [PDF]. Full version: [PS] [PDF]
- Optimizing Large Join
Queries in Mediation Systems. Ramana Yerneni, Chen Li, Jeffrey Ullman, Hector
Garcia-Molina. International Conference on Database Theory (ICDT), Jerusalem,
Israel,
January, 1999. (29% accepted) [PS]
[PDF]. Full version: [PS] [PDF]
- Searching Near-Replicas of
Images via Clustering. Edward Chang, Chen Li, James Wang, Peter Mork, and Gio Wiederhold. Proc. of SPIE Symposium of Voice, Video,
and Data Communications, Multimedia Storage and Archiving Systems VI,
pages 281-292, Boston, MA, September, 1999. [PS]
[PDF]
- RIME: A Replicated Image
Detector for the World-Wide Web. Edward Chang, James Ze Wang, Chen Li, and Gio Wiederhold. Proceedings of SPIE Symposium of Voice,
Video, and Data Communications, pages 58-67, Boston, MA, November 1998. [PS] [PDF]
- 2D BubbleUp: Managing Parallel Disks for
Media Servers.
Edward Chang, Hector Garcia-Molina, and Chen Li. The 5th
International Conference on Foundations of Data Organization (FODO), pages
221-230, Kobe, Japan,
1998. [PS] [PDF]
- Performance Analysis of
the Communication Mechanism for POE Workstation Cluster. Weiqiang Zhuang, Chen Li,
Meiming Shen. Microcomputer & Micro-system, Jan, 1995
Refereed Journal Articles
- Hobbes: optimized gram-based methods
for efficient read alignment, Athena Ahmadi, Alexander Behm, Nagesh Honnalli, Chen Li, Lingjie Weng, and Xiaohui
Xie, Nucleic Acids Research 2011; doi: 10.1093/nar/gkr1246. [PDF]
- SKIF-P: a point-based indexing and
ranking of web documents for spatial-keyword search, Ali Khodaei,
Cyrus Shahabi, and Chen Li, Geoinformatica,
Springer, 2011. [PDF]
- Supporting BioMedical Information
Retrieval: The BioTracer Approach, Heri Ramampiaro and Chen Li, In
Transactions on Large-Scale Data- and Knowledge-Centered Systems (TLDKS),
2011, No.4. Vol. 6990, Springer. pp. 73–94. [PDF]
- ASTERIX: towards a scalable,
semistructured data platform for evolving-world models. Alexander
Behm, Vinayak R. Borkar, Michael J. Carey, Raman Grover, Chen Li, Nicola
Onose, Rares Vernica, Alin Deutsch, Yannis Papakonstantinou, Vassilis J.
Tsotras, Distributed and Parallel Databases, 2011, 29(3), 185-216. [PDF]
- Efficient fuzzy full-text type-ahead
search, Guoliang Li, Shengyue Ji, Chen Li, Jianhua Feng. VLDB J.
20(4): 617-640 (2011). [PDF]
- Interactive and Fuzzy Search: A
Dynamic Way to Explore MEDLINE, Jiannan Wang, Inci Cetindil,
Shengyue Ji, Chen Li, Xiaohui Xie, Guoliang Li, Jianhua Feng, Journal of
Bioinformatics, 2010. [PDF]
- Rewriting Queries using Views,
Chen Li: Encyclopedia of Database Systems 2009: 2438-2441. [PDF]
- SAIL: Structure-aware indexing for
effective and progressive top-k keyword search over XML documents, Guoliang
Li, Chen Li, Jianhua Feng, Lizhu Zhou: Inf. Sci.
179(21): 3745-3762 (2009). [PDF]
- Human genomes as email attachments.
Scott Christley, Yiming Lu, Chen Li, and Xiaohui
Xie, Bioinformatics 25: 274-275 (2009). [PDF]. [Source Code]. It was the most downloaded article on the Web site of the Journal
of Bioinformatics for two months.
- SEPIA: Estimating Selectivities of
Approximate String Predicates in Large Databases. Liang Jin, Chen Li,
and Rares Vernica. VLDB Journal,
Volume 17, Number 5, pages 1213-1229, August 2008. [PDF]
- Using Views to Generate Efficient Evaluation
Plans for Queries, Foto Afrati, Chen Li, and Jeff Ullman, Journal of
Computer and System Sciences, Volume 73, Issue 5, pages 703-724, August 2007. [PDF]
- Rewriting Queries Using Views in the Presence of Arithmetic
Comparisons, Foto Afrati, Chen Li,
and Prasenjit Mitra,
Theoretical Computer Science, Volume 368, Numbers 1-2, pages 88-123, 2006.
[PDF]
- Supporting Efficient
Record Linkage for Large Data Sets Using Mapping Techniques, Chen Li,
Liang Jin, and Sharad Mehrotra, World Wide Web Journal, Volume 9, Number
4, pages 557-584, December 2006. [PDF]
- Achieving Communication
Efficiency through Push-Pull Partitioning of Semantic Spaces to
Disseminate Dynamic Information, Amitabha Bagchi, Amitabh Chaudhary, Michael T. Goodrich, Chen
Li, and Michal Shmueli-Scheuer. IEEE Transactions on Knowledge and Data
Engineering (TKDE), October 2006 (Vol. 18, No. 10). [PDF]
- Answering Queries Using
Materialized Views with Minimum Size. Rada Chirkova, Chen Li, and Jia
Li. VLDB Journal (2006), Volume 15, Number 3, 191-210. [PDF]
- Recent Progress on
Selected Topics on Database Research -- A Report from Nine Young Chinese
Researchers Working in the United
States. Zhiyuan Chen, Chen Li, Jian
Pei, Yufei Tao, Haixun
Wang, Wei Wang, Jiong Yang, Jun Yang, and Donghui Zhang. The Journal of Computer Science and
Technology. Vol. 18, No. 5, Pages 538 - 552, September 2003. [PDF]
- Computing Complete Answers
to Queries in the Presence of Limited Access Patterns. Chen Li. The
VLDB Journal (2003) 12: 211-227 [PS] [PDF]
- Answering Queries with
Useful Bindings. Chen Li and Edward Chang. ACM Transactions on
Database Systems (TODS), Volume 26, Issue 3 (September 2001). [PS] [PDF]
- Clustering for Approximate
Similarity Search in High-Dimensional Spaces. Chen Li, Edward Chang,
Hector Garcia-Molina, and Gio Wiederhold. IEEE Transactions on Knowledge and Data
Engineering, Volume 14, Number 4, pp. 792-808, July/August 2002. [PS] [PDF]
Refereed Workshop, Conference Demo Papers, Tutorials, and Other Publications
1. ASTERIX: An Open Source System for "Big Data" Management and Analysis,
Sattam Alsubaiee, Yasser Altowim, Hotham Altwaijry, Alexander Behm, Vinayak R. Borkar, Yingyi Bu, Michael J. Carey,
Raman Grover, Zachary Heilbron, Young-Seok Kim, Chen Li, Nicola Onose, Pouria Pirzadeh, Rares Vernica, Jian Wen.
PVLDB 2012 (demo).
2. Big data platforms: what's next? Vinayak R. Borkar, Michael J. Carey, Chen Li.
ACM Crossroads 19(1): 44-49, 2012.
3. qSpell: Spelling Correction of Web Search Queries
using Ranking Models and Iterative Correction. Yasser Ganjisaffar,
Andrea Zilio, Sara Javanmardi, Inci Cetindil, Manik Sikka,
Sandeep Katumalla, Narges Khatib, Chen Li, Cristina Lopes,
Spelling Alteration for Web Search Workshop, July 2011. [PDF], [Dataset]
(The authors won third place in Microsoft's Speller Challenge in 2011.)
- The Flamingo Software Package on Approximate String Queries. Chen
Li, DASFAA Workshops 2011, 477. [PDF],
[Source Code]
- Seaform: Search-As-You-Type in Forms, Hao
Wu, Guoliang Li, Chen Li, Lizhu Zhou, VLDB 2010
(Demo). [PDF]
- Search-As-You-Type: Opportunities and Challenges, Chen Li,
Guoliang Li, IEEE Data Eng. Bull. 33(1): 37-45 (2010). [PDF]
- Fuzzy Keyword Search on Spatial Data, Sattam Alsubaiee, Chen
Li: DASFAA, Excellent Demo Award, 2010: 464-467. [PDF], [Demos]
- Efficient top-k algorithms for fuzzy search in string collections,
Rares Vernica, Chen Li, KEYS 2009: 9-14, [PDF], [Talk Slides]
- Efficient Approximate Search on String Collections (Tutorial), Marios Hadjieleftheriou and Chen Li, VLDB 2009. [PDF], [Part I], [Part II].
- Efficient Approximate Search on String Collections (Tutorial), Marios Hadjieleftheriou and Chen Li, ICDE 2009. [PPT-Part1], [PPT-Part2].
- Quality-Aware Retrieval of Data Objects from Autonomous Sources
for Web-Based Repositories, Houtan Shirani-Mehr,
Chen Li, Gang Liang, Michal Shmueli-Scheuer, ICDE 2008 (poster). [PDF] [Technical Report]
- Communication-Efficient Query Answering with Quality Guarantees in
Client-Server Applications.
Michal Shmueli-Scheuer, Amitabh
Chaudhary, Avigdor Gal, Chen Li. WebDB 2007. [PDF]
- Quality-Driven Approximate
Methods for GIS Data Integration. Ramaswamy Hariharan, Michal Shmueli-Scheuer, Chen Li, and Sharad Mehrotra. ACM
GIS 2005, November 4-5, 2005, Bremen,
Germany.
[PDF]
- Answering Aggregation
Queries on Hierarchical Web Sites Using Adaptive Sampling. Foto Afrati, Paraskevas Lekeas, and Chen Li. Technical
Report, UCI ICS, August 2005. A short version
appears in CIKM'2005, 31st October - 5th November, 2005 Bremen,
Germany.
- XGuard:
A System for Publishing XML Documents without Information Leakage in the
Presence of Data Inference. Xiaochun Yang, Chen Li, Ge Yu, and Lei Shi. Proc. of ICDE'2005, demo track, Tokyo,
Japan,
March 2005.
- RACCOON: A Peer-Based
System for Data Integration and Sharing. Chen Li, Jia Li, Qi Zhong. Proc. of
ICDE'2004, demo track. [PDF]
- Schema-Guided Wrapper
Maintenance for Web-Data Extraction. Xiaofeng Meng, Dongdong Hu, Chen Li. To
appear in the Fifth International Workshop on Web Information and Data
Management (WIDM'03), New Orleans, Louisiana.
[PDF] [PPT].
- A Supervised Visual
Wrapper Generator for Web-Data Extraction. Xiaofeng
Meng, Haiyan Wang, Dongdong Hu, Chen Li. COMPSAC 2003: 657-662. [PDF]
- Using Constraints to
Describe Source Contents in Data Integration Systems. Chen Li. IEEE
Intelligent Systems 18(5): 49-53 (2003). [PDF]
- Describing and Utilizing Constraints
to Answer Queries in Data-Integration Systems. Chen Li. IJCAI 2003
workshop on Information Integration on the Web, August 2003, Acapulco,
Mexico.
[PDF], [PPT]
- Towards Perception-Based
Image Retrieval. Edward Chang, Beitao Li,
and Chen Li. Proceedings of IEEE Workshop on Content-based Access of Image
and Video Libraries, p. 401-412, South Carolina, June, 2000. [PS] [PDF]
- Managing Parallel Disks
for Continuous Media Data. Edward Chang, Chen Li, and Hector
Garcia-Molina. A Book Chapter in Information Organization & Databases,
p. 107-120, Kluwer Publisher, 2000. [PS] [PDF]
- Answering Queries with
Database Restrictions (Research Summary). Chen Li. Symposium on
Abstraction, Reformulation and Approximation (SARA), pages 328 - 329,
July, 2000, Horseshoe Bay
(Lake LBJ),
Texas.
[PS] [PDF]
- I wrote a report of the Workshop on Data
Mining in the Internet Age, which was held May 1 - 2, 2000, IBM Almaden
Center, San Jose, California. [PS] [PDF]
- Capability Based Mediation
in TSIMMIS. Chen Li, Ramana
Yerneni, Vasilis Vassalos, Hector Garcia-Molina, Yannis Papakonstantinou,
Jeffrey Ullman, Murty Valiveti. Proc. of ACM SIGMOD'98, demo track,
pages 564 - 566, Seattle,
WA, June, 1998. [PS] [PDF]
- HiComm
-- A New Technique for Improving Communication Performance in Workstation
Cluster. Chen Li, Weiqiang Zhuang, Meiming Shen, Dingxing
Wang, Weimin Zheng, Proc. of International
Workshop on Advanced Parallel Processing Technologies (APPT), October,
1995, Beijing, China.
Ph.D. Thesis
Query Processing and Optimization in
Information-Integration Systems. Chen Li. Ph.D.
Thesis, Computer Science Department, Stanford University, August, 2001.