Download Xapian / Omega 1.0.3

Xapian is an open source information retrieval library written in C++ that can serve as the engine behind a search engine. It includes its own database format, APIs to edit and search databases, tools to verify databases, and bindings for other languages such as Java, Ruby, PHP, and Python. Omega is an application built on top of Xapian: a search engine for searching Xapian databases. Omega also includes some tools for populating databases with data. Since Omega’s development is closely linked to Xapian’s, the developers release new versions of both programs simultaneously, with the same version number.

The development team at The Xapian Project has released version 1.0.3 of Xapian and Omega. The lists of changes for the various components are as follows:

Xapian core 1.0.3:

API:

  • Add support for user specified metadata (bug#143). Currently supported by the flint and inmemory backends.
  • Deprecate Inquire::register_match_decider() which has always been a no-op.
  • Improve the lower bound on the number of matching documents for an AND query – if the sum of the lower bounds for the two sides is greater than the number of documents in the database, then some of them must have both terms.
  • Spelling correction: Fix off-by-one error in loop bounds when initialising (bug#194).
  • If the check_at_least parameter to Inquire::get_mset() is used, but there aren’t that many results, then MSet::get_matches_lower_bound() and MSet::get_matches_upper_bound() weren’t always reported as equal – this bug is now fixed.
  • When sorting by value, and using the check_at_least parameter to Inquire::get_mset(), some potential matches weren’t being counted.
  • Failing to create a flint or quartz database because we couldn’t create the directory for it now throws DatabaseCreateError not DatabaseOpeningError.
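The improved lower bound for AND queries described in the API list is a pigeonhole argument, and it can be sketched in a few lines. The function name and parameters below are hypothetical and chosen only for illustration; this is not Xapian's actual implementation, just the arithmetic the release note describes.

```python
def and_lower_bound(left_lower: int, right_lower: int, doc_count: int) -> int:
    """Lower bound on the number of documents matching both sides of an AND.

    Hypothetical helper: if the two lower bounds add up to more than the
    number of documents in the database, the excess must match both sides.
    """
    if min(left_lower, right_lower, doc_count) < 0:
        raise ValueError("counts must be non-negative")
    overlap = left_lower + right_lower - doc_count
    # A negative overlap means the two sides could be disjoint,
    # so the only safe lower bound is zero.
    return max(0, overlap)


# 70 + 50 = 120 documents "claimed" in a database of 100: at least 20 overlap.
print(and_lower_bound(70, 50, 100))  # 20
# 30 + 40 = 70 fits within 100 documents, so no overlap is guaranteed.
print(and_lower_bound(30, 40, 100))  # 0
```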

test suite:

  • Fix display of valgrind output when a test fails because valgrind detected a problem.
  • Add another version of valgrind suppression for the zlib end condition check as this gives a different backtrace for zlib in Ubuntu gutsy.

flint back end:

  • The Flint database format has been extended to support user metadata, and each termlist entry is now a byte shorter (before compression). As a result, Xapian 1.0.2 and earlier won’t be able to read Xapian 1.0.3 databases. However, Xapian 1.0.3 can read older databases. If you open an older flint database for writing with Xapian 1.0.3, it will be upgraded such that it cannot then be read by Xapian 1.0.2 and earlier.
  • Zlib compression wasn’t being used for the spelling or synonym tables (due to a typo – Z_DEFAULT_COMPRESSION where it should be Z_DEFAULT_STRATEGY).
  • xapian-check: Allow “db/record.” and “db/record.DB” as arguments.
  • Fix “key too long” exception message by substituting FLINT_BTREE_MAX_KEY_LEN with its numeric value.
  • Assorted minor efficiency improvements.
  • If we reach the flush threshold during a transaction, we now write out the postlist changes, but don’t actually commit them.
  • Check length of new terms is at most 245 bytes for flint in add_document() and replace_document() so that the API user gets an error there rather than when flush() is called (explicitly or implicitly). Fixes bug#44.
  • Flint used to read the value of the environment variable XAPIAN_FLUSH_THRESHOLD when the first WritableDatabase was opened and would then cache this value. However, the program using Xapian may have changed it since, so we now reread it each time a WritableDatabase is opened.
  • Implement TermIterator::positionlist_count() for the flint backend.
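The 245-byte term limit mentioned in the flint list is measured in bytes, not characters, which matters for non-ASCII terms. The following is a minimal sketch of such an early check, assuming terms are encoded as UTF-8; the helper name and the use of ValueError are assumptions made for illustration and do not reflect the exception Xapian itself throws.

```python
MAX_TERM_BYTES = 245  # limit quoted in the release notes for the flint backend


def check_term_length(term: str) -> bytes:
    """Return the UTF-8 encoding of term, or raise if it exceeds the limit.

    Hypothetical helper: it mimics rejecting an over-long term at the point
    the document is added rather than later, when changes are flushed.
    """
    encoded = term.encode("utf-8")
    if len(encoded) > MAX_TERM_BYTES:
        raise ValueError(
            f"term is {len(encoded)} bytes; at most {MAX_TERM_BYTES} allowed"
        )
    return encoded


check_term_length("a" * 245)  # exactly 245 bytes: accepted

try:
    # Only 123 characters, but each "é" is 2 bytes in UTF-8, giving 246 bytes.
    check_term_length("é" * 123)
except ValueError as exc:
    print(exc)
```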

remote backend:

  • Fix the result of MSet::get_matches_lower_bound() when using the check_at_least parameter to get_mset().

in-memory backend:

  • Implement TermIterator::positionlist_count() for the inmemory backend.

build system:

  • xapian-config: We always need to include dependency_libs in the output of `xapian-config --libs` if shared libraries are disabled.
  • Distribution tarballs are now in the POSIX “ustar” format. This supports pathnames longer than 99 characters (which we now have a few instances of in the doxygen-generated documentation) and also results in a distribution tarball that is about half the size! This format should be readable by any tar program in current use – if your tar program doesn’t support it, we’d like to know (but note that the GNU tar tarball is smaller than the size reduction in the xapian-core tarball...)
  • configure no longer generates msvc/version.h – this is now entirely handled by the MSVC-specific makefiles.

documentation:

  • Add a glossary.
  • docs/voting.html: Reorder the initial paragraphs so we actually answer the question “What is a voting algorithm?” upfront.
  • When running rst2html, use “--exit-status=warning” rather than “--strict”. The former actually gives a non-zero exit status for a warning or worse, while the latter doesn’t, but does include any “info” messages in the output HTML.
  • docs/deprecation.rst: Add “Database::positionlist_begin() throwing RangeError and DocNotFoundError”.
  • valueranges.rst: Correct out-of-date reference to float_to_string.
  • HACKING: Document a few more “coding standards”.
  • PLATFORMS: Updated.
  • docs/overview.html: Restore HTML header accidentally deleted in November 2006.
  • Fix several typos.

portability:

  • Add missing instances of “#include ” to fix compilation with recent GCC 4.3 snapshots.
  • Fix some warnings for various compilers and platforms.

Omega 1.0.3:

general:

  • Distribution tarballs are now in the POSIX “ustar” format since it saves a few KB and we need to use it for xapian-core anyway.

documentation:

  • Expand the output of ‘mbox2omega --help’ and refer the reader to it from docs/scriptindex.txt.

indexers:

  • omindex:
    • Add support for indexing AbiWord documents and TeX DVI files.
    • Impose a 5 minute CPU time limit on filter programs to prevent problems if a filter program goes into an infinite loop on a malformed input. Partially addresses bug#111.
  • scriptindex:
    • Fix line number tracking in dump files.

omega:

  • Add $muldiv{A,B,C} which calculates int(A*B/C).
  • Fix bug in decimal fraction in $size for files >= 1M in size.
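As a rough illustration of what $muldiv{A,B,C} computes, here is a sketch of int(A*B/C) using integer arithmetic. The Python function is a hypothetical stand-in, not Omega's code, and it assumes that int() truncates toward zero.

```python
def muldiv(a: int, b: int, c: int) -> int:
    """Compute int(a * b / c), truncating toward zero.

    Hypothetical stand-in for Omega's $muldiv{A,B,C}. Multiplying before
    dividing keeps the precision that a * (b // c) would lose.
    """
    if c == 0:
        raise ZeroDivisionError("C must be non-zero")
    product = a * b
    quotient = abs(product) // abs(c)
    # Restore the sign so that the result truncates toward zero.
    return quotient if (product >= 0) == (c > 0) else -quotient


print(muldiv(3, 100, 7))     # 42, since 300 / 7 = 42.857...
print(muldiv(75, 200, 100))  # 150
```

A calculation of this shape is the kind a template might use to scale a percentage score to a width in pixels, although the release notes do not say what the query template actually does with it.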

template:

  • query:
    • Set HTML charset to utf-8 since that’s what databases now are by default.
    • Restyle to use CSS to draw a “score bar” instead of using images.
    • Rework the layout of each hit.
    • Add popup hints on mouse-over for various items.
    • Tidy up some HTML gremlins.

Xapian bindings 1.0.3:

General:

  • Wrap new methods Database::get_metadata() and WritableDatabase::set_metadata().
  • “make uninstall” now removes the loadable module we install for each of the bindings.
  • “make distcheck” now works.
  • Distribution tarballs are now in the POSIX “ustar” format since it saves about 40KB and we need to use it for xapian-core anyway.

Packaging:

  • RPMs: Package xapian.php.

CSharp:

  • Remove wrapper for ValueRangeProcessor::operator(), since it can’t be usefully used currently.

Java:

  • Remove wrappers for the Muscat36 backend, which has now been dropped from the C++ library.
  • “make clean” now removes the class files generated for inner classes.

PHP:

  • Add feature test for DateValueRangeProcessor when used with QueryParser.
  • ValueRangeProcessor::apply() can now be called from PHP (bug#193). This isn’t actually very useful, since you can’t subclass it in PHP currently.
  • Fixed wrapping of Inquire::set_cutoff() – previously this would only work if the third parameter was specified and a floating point number (eg 0.0).
  • php/docs/bindings.html: Fix errors in example code.

Python:

  • ValueRangeProcessor::operator() is now wrapped as a __call__ method in Python which takes two strings and returns a 3-tuple (value_number, modified_begin, modified_end). Previously this always failed with a type error, so this doesn’t break existing code.
  • python/pythontest.py: Interpret any commandline arguments as a list of tests to be run (the default is to run all tests).
  • README,python/docs/bindings.html: Add a note about the problems with mod-python (as described in bug #185).
  • python/pythontest.py: Delete the database handles before deleting a database to fix problems running the Python tests on MS Windows (bug#179).
  • “make clean” now removes testsuite.pyc.

Ruby:

  • Check for RUBY_INC, RUBY_LIB, and RUBY_LIB_ARCH in the environment or on the configure command-line. The defaults for RUBY_LIB and RUBY_LIB_ARCH are now the site-specific directories, which is more correct when building from source. Debian packages, etc can override this by setting RUBY_LIB and RUBY_LIB_ARCH.

Tcl:

  • Check for TCL_LIB in the environment or on the configure command-line to allow installing without root access more cleanly.

The following files can be downloaded:
Xapian 1.0.3
Omega 1.0.3
Xapian bindings 1.0.3

Version number 1.0.3
Release status Final
Operating systems Windows 9x, Windows NT, Windows 2000, Linux, BSD, Windows XP, macOS, Solaris, UNIX, Windows Server 2003, Windows Vista
Website The Xapian Project
License type GPL