AvantSearch¶
Provides extended searching and search results capabilities for an Omeka Classic installation.
When this plugin is activated, it dynamically overrides the native Omeka search box with the search box used by AvantSearch. When you deactivate the plugin, the Omeka search box returns.
Differences from Omeka¶
AvantSearch completely overrides Omeka's public search features. It provides its own Advanced Search page and presents search results in a wide variety of ways. It does not affect Omeka's admin search. The table below highlights the differences between AvantSearch and Omeka's native search.
Feature | AvantSearch | Omeka Search |
---|---|---|
Quick search | Yes - Feature | No |
Simple search for All Words | Yes - Feature | No |
Search in Titles only | Yes - Advanced Search page option | No |
Search PDF file text | Yes - Advanced Search page option | No |
Search only items with images or files | Yes - Advanced Search page option | No |
Date range search | Yes - Advanced Search page option | No |
User can specify number of results | Yes - Advanced Search page option | No |
Tabular results | Yes - Feature | No |
Custom Results Layouts | Yes - Configuration option | No |
Image View | Yes - Feature | No |
Index View | Yes - Configuration option | No |
Grid View | Yes - Configuration option | No |
Integer sorting | Yes - Configuration option | No |
Address sorting | Yes - Configuration option | No |
Lightbox | Yes - Feature | No |
Search by File, Collection, Featured | No | Yes |
Configuration options¶
The following sections describe each of the AvantSearch plugin's configuration options in detail.
See also the documentation for installing AvantSearch.
Address Sorting option¶
This option is only supported by MariaDB and MySQL 8.0. If your server is not running one of these databases, the AvantSearch configuration page will say the option is not available for your installation. If you want to use this option, contact your web host to ask about moving to a server that has MariaDB. If your server is running one of the supported databases and you still see the message that the option is not available for your installation, you'll have to add an element named Address.
Address sorting improves search results by sorting addresses first on the street name and then by the street number as an integer.
Normally addresses are sorted in a database, or in an Excel spreadsheet, as ordinary text where numbers sort before letters. Furthermore, numbers are normally sorted as text rather than as integers, such that 10 appears before 9.
Without address sorting:
- 10 Main Street
- 72 Pleasant Lane
- 9 Main Street
With address sorting:
- 9 Main Street
- 10 Main Street
- 72 Pleasant Lane
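The two orderings above can be reproduced with a short sketch. It assumes every address starts with a street number followed by the street name; the function name address_sort_key is illustrative and is not part of the plugin, which performs this sorting in the database.

```python
import re


def address_sort_key(address):
    """Return (street name, street number) so that addresses group by
    street and the numbers compare as integers rather than as text."""
    match = re.match(r"\s*(\d+)\s+(.*)", address)
    if match is None:
        # Assumption: an address without a leading number sorts by its
        # text alone, with 0 standing in for the missing number.
        return (address.strip().lower(), 0)
    number, street = match.groups()
    return (street.strip().lower(), int(number))


addresses = ["10 Main Street", "72 Pleasant Lane", "9 Main Street"]

# Ordinary text sort: "10" comes before "9" because "1" < "9".
text_sorted = sorted(addresses)

# Address sort: street name first, then street number as an integer.
address_sorted = sorted(addresses, key=address_sort_key)
```

Here text_sorted keeps the "without address sorting" order shown above, while address_sorted yields 9 Main Street, 10 Main Street, 72 Pleasant Lane.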
Columns option¶
Use the Columns option to specify:
- The order of columns from left to right in search results Table View
- An alias for an element's name, e.g. Catalog # for the Identifier element
- The width of a column
- The alignment of column text (left, center, or right)
Syntax:¶
The syntax for each row of the Columns option is
<element-name> [ "," <alias> ] [ ":" <width> [ "," <alignment> ] ]
Where:
- <element-name> is the name of an Omeka element.
- <alias> is an optional parameter preceded by a comma to provide another name for the element, e.g. 'ID' for 'Identifier'.
- <width> is an optional parameter preceded by a colon to indicate the width of the element's column in pixels.
- <alignment> is an optional parameter preceded by a comma that can only be specified if width is provided. It specifies the alignment of the column's text as right, center, or left.
Column Order:¶
The order of columns from left to right in search results Table View is determined as follows:
- The order, first to last, in which you specify elements with the Columns option.
- For elements that are not specified in the Columns option, the order in which column names appear, top to bottom, and left to right, in the Detail Layout option.
Note that because of the order precedence above, you cannot have columns appear in a specific order in one layout and in a different order in another layout. The reason for this restriction is that the content for all columns is contained in the HTML for the search results Table View, but only the columns for the selected layout are visible. When you select another layout, the previous layout's columns are hidden and the new layout's columns are made visible. This is what allows instantaneous switching between layouts.
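A minimal sketch of the precedence rule, assuming each option has already been read into a plain list of element names. The function name table_view_column_order and the sample lists are illustrative only; they are not taken from the plugin's code.

```python
def table_view_column_order(columns_option, detail_layout):
    """Elements named in the Columns option come first, in the order given;
    remaining Detail Layout elements follow in their own order."""
    ordered = list(columns_option)
    for element in detail_layout:
        if element not in ordered:
            ordered.append(element)
    return ordered


# Sample data (hypothetical configuration values).
columns_option = ["Identifier", "Title", "Type", "Subject",
                  "Creator", "Publisher", "Status"]
detail_layout = ["Identifier", "Type", "Subject", "Creator",
                 "Publisher", "Date", "Place", "Address"]

column_order = table_view_column_order(columns_option, detail_layout)
```

With these sample lists, Date, Place, and Address land after Status because they appear only in the Detail Layout list.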
Below is an example specification for the Columns option.
Identifier, Item: 65, right
Title
Type
Subject
Creator
Publisher
Status: 90
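To make the row syntax concrete, here is a hedged sketch of how one row of the Columns option could be split into its parts. The ColumnSpec class and parse_column_row function are hypothetical names, and the sketch assumes element names and aliases contain no comma or colon; the plugin's own parser may behave differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnSpec:
    element: str
    alias: Optional[str] = None
    width: Optional[int] = None      # pixels
    alignment: Optional[str] = None  # "left", "center", or "right"


def parse_column_row(row):
    """Parse '<element-name> [, <alias>] [: <width> [, <alignment>]]'."""
    # Split off the optional ": <width> [, <alignment>]" part first.
    name_part, _, size_part = row.partition(":")
    element, _, alias = (s.strip() for s in name_part.partition(","))

    width = None
    alignment = None
    if size_part.strip():
        width_text, _, align_text = (s.strip() for s in size_part.partition(","))
        width = int(width_text)
        alignment = align_text.lower() or None
        if alignment not in (None, "left", "center", "right"):
            raise ValueError(f"Unknown alignment: {align_text!r}")

    return ColumnSpec(element, alias or None, width, alignment)


first = parse_column_row("Identifier, Item: 65, right")
plain = parse_column_row("Title")
sized = parse_column_row("Status: 90")
```

Applied to the example above, the first row gives the Identifier element the alias Item, a width of 65 pixels, and right alignment, while Status gets only a width of 90.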
Detail Layout option¶
L1 is a special layout referred to as the Detail Layout because it presents a lot of information about an item, including a thumbnail, in a single column. Use the Detail Layout option to specify the elements which appear in the Detail layout.
Syntax:¶
Specify each element name on a separate row.
Below is an example specification of the Detail Layout option.
Identifier
Type
Subject
Creator
Publisher
Date
Place
Address
<score>
<tags>
The pseudo elements <score> and <tags> can be used to show scoring and tags (see the Glossary).
PDF Search option¶
Check the PDF Search checkbox to allow searching of the text of PDF files that are attached to items. The PDF files must be searchable (born-digital or processed by OCR). When you check the box, the plugin extracts the text from the PDF files attached to each item and adds it to the search_texts table record for each item.
By default, the search_texts table contains only metadata element values. The table is what Omeka uses for keyword searching.
Updating the search_texts table with PDF text can take a long time if you have many PDF files, so be patient. You can monitor progress by looking at the log file /plugins/AvantSearch/log-pdf-search.csv. The log file gets recreated each time you enable this option.
When you upload a PDF file to an item, or delete a PDF file, the plugin updates the search_texts table with the text of whatever PDFs are attached to the item after you save the item.
You can disable PDF searching by unchecking the box for this option, but that alone does not remove the PDF text from the search_texts table. To remove the PDF text, you need to rebuild the search_texts table by going to the Omeka Settings page and choosing the Search tab. Then click the Index Records button.
Elasticsearch option¶
Check the Elasticsearch checkbox if you are using AvantElasticsearch. The PDF Search and Elasticsearch options are mutually exclusive: you can use one or the other or neither, but not both.
Integer Sorting option¶
The Integer Sorting option lets you specify a list of elements for columns that should be sorted as integers instead of as text. This option ensures that the data in these columns is sorted numerically instead of alphabetically. For example, an alphabetic sort of 14, 116, 127, 1102 results in 1102, 116, 127, 14 because alphabetically, the character sequence 1102 precedes 14, 116, and 127. Likewise, the characters 14 are greater than the first two characters in the other three numbers and thus 14 sorts last. The values of elements specified with this option are converted to integers for sorting purposes.
Below is an example specification of the Integer Sorting option.
Identifier
Box #
Note that you can use the Integer Sorting option for elements with values that only contain integers and also for elements with values that start with integers, but are followed by text. In that case, the text is ignored and the sort is performed only on the integer portion of the value.
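The difference between the two sort orders, including values that start with an integer and continue with text, can be sketched as follows. The helper name leading_integer is hypothetical, and treating a value with no leading digits as 0 is an assumption of this sketch rather than documented plugin behavior.

```python
import re


def leading_integer(value):
    """Return the integer at the start of the value, ignoring any text
    that follows it. Values without leading digits are treated as 0
    (an assumption made for this sketch)."""
    match = re.match(r"\s*(\d+)", value)
    return int(match.group(1)) if match else 0


values = ["14", "116", "127", "1102"]
alphabetic = sorted(values)                    # compared character by character
numeric = sorted(values, key=leading_integer)  # compared as integers

# Values that begin with an integer but continue with text.
mixed = ["12 (copy)", "3A", "100"]
mixed_sorted = sorted(mixed, key=leading_integer)
```

The alphabetic list reproduces the 1102, 116, 127, 14 order described above, while the numeric list runs 14, 116, 127, 1102.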
Layouts option¶
The Layouts option lets you specify different ways to present search results in Table View. The layouts you define here will appear in the Layout Selector and on the Advanced Search page.
Syntax:¶
The syntax for each row of the Layouts option is
<layout-id> "," <layout-name> [ "," <admin> ] [ ":" <column-name> [ "," <column-name> ]* ]
Where:
- <layout-id> is 'L' followed by an integer, e.g. 'L3'. The numbers do not have to be consecutive from layout to layout. 'L1' is reserved for the Detail Layout described in the Detail Layout option section.
- <layout-name> is a short description of the layout. It will appear in the Layout Selector list.
- <admin> is an optional instance of the word "admin" (without quotes) to indicate that only a logged in user can see and choose this layout in the Layout Selector.
- <column-name> is the name of an element that will appear as a column in the layout. Use a comma between column names.
The purpose of the layout Id is to uniquely identify a layout in the query string for the Table View page. You can use this query string as a link on web pages to display search results in a specific layout. The Id ensures that those results will appear using the correct layout even if you change the layout's name or its position in the Layouts option list.
Below is an example specification of Layouts.
L1, Summary
L2, Creator/Publisher: Identifier, Title, Creator, Publisher, Date
L3, Type/Subject: Identifier, Title, Subject, Type
L6, Confidential, admin: Identifier, Title, Status, Notes
Notes about the example above:
- Each layout begins with an Id and Name.
- The fourth row also specifies 'admin'.
- You don't specify columns for the L1 layout (see the Detail Layout option section), but you do specify its Name.
- In the example, the columns for the other layouts always begin with "Identifier, Title" so that users see those values on every layout. Repeating these columns is a convention, but is not required.
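As an illustration of the row syntax, the sketch below splits a Layouts row into its Id, name, admin flag, and column list. The Layout class and parse_layout_row function are hypothetical names, and the sketch assumes layout names and column names contain no comma or colon; it is not the plugin's own parser.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Layout:
    layout_id: str
    name: str
    admin_only: bool = False
    columns: List[str] = field(default_factory=list)


def parse_layout_row(row):
    """Parse '<layout-id>, <layout-name> [, admin] [: <column-name>, ...]'."""
    head, _, column_part = row.partition(":")
    parts = [p.strip() for p in head.split(",")]
    if len(parts) < 2 or not re.fullmatch(r"L\d+", parts[0]):
        raise ValueError(f"Row must start with an Id and a name: {row!r}")
    admin_only = len(parts) > 2 and parts[2].lower() == "admin"
    columns = [c.strip() for c in column_part.split(",") if c.strip()]
    return Layout(parts[0], parts[1], admin_only, columns)


rows = [
    "L1, Summary",
    "L2, Creator/Publisher: Identifier, Title, Creator, Publisher, Date",
    "L6, Confidential, admin: Identifier, Title, Status, Notes",
]
layouts = [parse_layout_row(row) for row in rows]

# Layout names a visitor who is not logged in would be offered.
public_names = [layout.name for layout in layouts if not layout.admin_only]
```

Because the Id is parsed separately from the name, renaming a layout or moving its row leaves links that reference the Id unchanged.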
Titles Only option¶
When this option is checked, radio buttons will appear under the keywords text box on the Advanced Search page to let the user choose to search in all fields or in titles only. This feature is very helpful for narrowing search results down to only the most relevant items because titles often contain the most important keywords.
NOTE: If you want to use this option, but the configuration page says it's not available for your installation, you'll need to add a FULLTEXT index to the title column of the search_texts table. This is easily done using phpMyAdmin by following these steps:
- Select the 'search_texts' table
- Click the Structure tab
- On the row for the title column, click Fulltext among the actions at the far right
- Click OK on the dialog confirming that you want to add FULLTEXT to the column
- The title column will now appear in the Indexes section showing its type as FULLTEXT (expand the Indexes section if it's not visible)
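If you prefer SQL to the phpMyAdmin dialog, a FULLTEXT index can generally be added with a single ALTER TABLE statement. The sketch below only builds that statement as a string and does not connect to a database. The omeka_ table prefix and the index name title are assumptions, so check the prefix in your installation's db.ini before running the statement, and confirm afterwards that the configuration page reports the option as available.

```python
def fulltext_index_statement(table_prefix="omeka_"):
    """Build the SQL that adds a FULLTEXT index on the title column.
    The default prefix is an assumption; installations may differ."""
    table = f"{table_prefix}search_texts"
    return f"ALTER TABLE `{table}` ADD FULLTEXT INDEX `title` (`title`);"


statement = fulltext_index_statement()
print(statement)
```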
Improving Search Results¶
The information in this section is only important when using AvantSearch without AvantElasticsearch. AvantElasticsearch uses an entirely different and more effective searching mechanism that is independent of the underlying MySQL or MariaDB database.
The AvantSearch plugin will work without any modifications to your database. However, read this section to learn how you can improve search results by changing just one setting.
Like Omeka's native search, AvantSearch performs keyword searches using the Omeka search_texts table. The Omeka installer creates this table using the MyISAM storage engine. You will get much better results from keyword searches by changing the table to use the InnoDB storage engine because MyISAM negatively affects keyword searches in two ways:
- MyISAM uses a very long list of stopwords.
- MyISAM's default settings ignore keywords of three characters or less (ft_min_word_len).
With MyISAM, a search for road+ map+ will ignore map and thus return all items containing road instead of only those items containing road AND map. Additionally, the MyISAM stopword list contains so many words that people commonly search for that users are often surprised when items don't appear in search results.
In contrast, InnoDB has a very short list of stopwords and only ignores keywords that are two characters or less (innodb_ft_min_token_size). Although you can change the value of ft_min_word_len to 3, this variable can only be set at the MySQL server level, and a server restart is required to change it. If you are using a shared server, you probably don't have the option to change this value.
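The effect of the minimum keyword length can be illustrated with a small simulation. It models only the length rule (stopwords are left out), and the function name searchable_terms is hypothetical; the two constants reflect the default values of ft_min_word_len and innodb_ft_min_token_size described above.

```python
MYISAM_MIN_LENGTH = 4  # default ft_min_word_len: 3 characters or less ignored
INNODB_MIN_LENGTH = 3  # default innodb_ft_min_token_size: 2 or less ignored


def searchable_terms(query, min_length):
    """Return the query terms that are long enough to be searched.
    The '+' operators are stripped before measuring each term."""
    terms = [term.strip("+") for term in query.split()]
    return [term for term in terms if len(term) >= min_length]


myisam_terms = searchable_terms("road+ map+", MYISAM_MIN_LENGTH)
innodb_terms = searchable_terms("road+ map+", INNODB_MIN_LENGTH)
```

Under the MyISAM default only road survives, so the search is effectively for road alone; under the InnoDB default both road and map are kept.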
Learn how to change from MyISAM to InnoDB.
Dependencies¶
The AvantSearch plugin requires that the AvantCommon plugin be installed and activated.
Installation¶
To install the AvantSearch plugin, follow these steps:
- First install and activate the AvantCommon plugin.
- Download the latest release from https://github.com/gsoules/AvantSearch
- Unzip AvantSearch-master.zip into your Omeka plugins folder
- Rename the folder to AvantSearch
- Activate the plugin from the Omeka Plugins page
Warning¶
Use this software at your own risk.
License¶
This plugin is published under GNU/GPL.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Copyright¶
Credits¶
Inspiration for the Index View search results came from the alphabetized index and hierarchical list features in the Daniel-KM / Reference plugin.