Revision history for Search Syntax

User Date Body
Sonata Dusk<p><i>(Note: Directly kipped from Derpibooru for now, updating examples and making the text easier to read comes later. - Sonata, May 20th 2020)</i> <h2 id="about">About Search Syntax</h2> <p> The search engine allows users to locate uploads via tags that are associated with an upload and some other metadata such as uploader and user count. It also permits chaining together specified tags and metadata to search for specific logical combinations, to allow more precise filtering. This guide explains the syntax and features of individual terms and then shows how they are combined into more complex queries. </p> <ol> <li><a href="#about">About Search Syntax</a> </li> <li> <a href="#terms">Search Terms</a> <ol> <li><a href="#tag-behavior">Tag Search Behavior</a> </li> <li><a href="#fields">Searching Through Other Fields</a> </li> <li><a href="#numeric-range">Numeric Range Queries</a> </li> <li><a href="#date-range">Date/Time Range Queries</a> </li> <li><a href="#supported-fields">Supported Fields</a> </li> </ol> </li> <li> <a href="#special-characters">Special Characters and Suffixes</a> <ol> <li><a href="#wildcards">Wildcards</a> </li> <li><a href="#escaping">Escaping Special Characters</a> </li> <li><a href="#fuzzy">Approximate (Fuzzy) String Matching</a> </li> </ol> </li> <li> <a href="#grammar">Search Grammar: Term Operators and Combinations</a> <ol> <li><a href="#expressions">Expressions</a> </li> <li><a href="#expressions-summary">Summary Table</a> </li> <li><a href="#negation">Negation</a> </li> <li><a href="#and-expressions">Commas and AND Expressions</a> </li> <li><a href="#or-expressions">OR Expressions</a> </li> <li> <a href="#compound">Compound Expressions</a> <ol> <li><a href="#precedence">Operator Precedence</a> </li> <li><a href="#parentheses">Defining Subexpressions with Parentheses</a> </li> <li><a href="#auto-escaping">Automatic Parentheses Escaping</a> </li> </ol> </li> </ol> </li> <li><a href="#boosting">Boosting Terms</a> </li> </ol> <h2 
id="terms">Search Terms</h2> <p> Specific searches require the inclusion of search terms, which individually define the criteria expected of each upload result to be returned by the search engine. </p> <h3 id="tag-behavior">Tag Search Behavior</h3> <p> Searching a single term is obvious: merely type in the term you want. By default, the term you use will be searched among the indexed image tags and aliases. Thus, a search for <code><a href="/search?q=pinkie+pie">pinkie pie</a></code> would, as you may surmise, result in all appropriately tagged and indexed pictures of Pinkie Pie. Aliases are also indexed, so a search for the tag alias <code><a href="/search?q=ts">ts</a></code> is the same as one for <code><a href="/search?q=twilight+sparkle">twilight sparkle</a></code>. </p> <p> Tag searches are case-insensitive, so capitalization is irrelevant for queries. For example, the search queries <code><a href="/search?q=pinkie+pie">pinkie pie</a> </code> and <code><a href="/search?q=Pinkie+Pie">Pinkie Pie</a> </code> will share the same result set. </p> <h3 id="fields">Searching Through Other Fields</h3> <p> Other fields are also indexed, and you can search them using the namespace convention that is also used by tags. Namely, one enters the field name followed by a colon, and finally, the target value. For example, to search for images with a width of 1920, we would search within the <code>width</code> field and so construct the query <code><a href="/search?q=width%3a1920">width:1920</a></code>. If a tag's namespace happens to match the name of a field, the tag can still be queried via quoting or escaping. </p> <h3 id="numeric-range">Numeric Range Queries</h3> <p> Numeric fields in particular support queries for ranges of possible values. 
A qualifier can be added to the end of the field name with a single period to indicate desired results that are greater than or less than the supplied value; optionally, the supplied value itself can be included in the range, too. To find images with a score greater than 100, we would enter <code><a href="/search?q=score.gt%3a100">score.gt:100</a></code>. For an inclusive search of scores greater than <em>or equal to</em> 100, we would instead enter <code><a href="/search?q=score.gte%3a100">score.gte:100</a></code>. The following table enumerates the supported qualifiers. </p> <table class="table"> <thead> <tr> <th>Qualifier</th> <th>Meaning</th> <th>Example</th> </tr> </thead> <tbody> <tr> <td> <code>gt</code> </td> <td> Values greater than specified, and not including the specified value </td> <td> <code><a href="/search?q=score.gt%3a100">score.gt:100</a> </code></td> </tr> <tr> <td> <code>gte</code> </td> <td> Values greater than or equal to specified </td> <td> <code><a href="/search?q=score.gte%3a100">score.gte:100</a> </code></td> </tr> <tr> <td> <code>lt</code> </td> <td> Values less than specified, and not including the specified value </td> <td> <code><a href="/search?q=width.lt%3a100">width.lt:100</a> </code></td> </tr> <tr> <td> <code>lte</code> </td> <td> Values less than or equal to specified </td> <td> <code><a href="/search?q=width.lte%3a100">width.lte:100</a> </code></td> </tr> </tbody> </table> <h3 id="date-range">Date/Time Range Queries</h3> <p> Date and time values are specified using a tweaked subset of the <a href="https://en.wikipedia.org/wiki/ISO_8601">ISO 8601 standard</a>. A full date is specified by a four-digit year, followed by a two-digit month and day, with each value delimited by a hyphen, i.e., "YYYY-MM-DD". Like in ISO 8601, one can specify just the year and month, or even just the year, as long as the less precise information is included in left-to-right order without dangling hyphens. 
A truncated value is interpreted as the range of the entire period (not just the first day of the month, etc.). For example, <code>2015-04</code> represents the entire month of April 2015. </p> <p> Given a full date, a specification for the time of day can be added. To do so, separate the time from the date with a <code>T</code> or a space, followed by the hours, minutes, and seconds, each specified with two digits and separated by a colon, i.e., "HH:MM:SS". The hours follow a 24-hour clock. As with date values, one may alternatively specify entire minutes and even hours by truncating the value without dangling colons. The value <code>2014-04-20 16</code> represents the entire hour of 4 PM on 20 April 2014 (UTC). The entire first minute of that hour can be specified with <code>2014-04-20 16:00</code>. </p> <p> By default, time follows international UTC ("Zulu") time. (In terms of the ISO 8601 standard, a <code>Z</code> suffix is implied.) One may specify an offset for local time by affixing a plus or minus sign, followed by the offset hours as two digits, a colon, and the offset minutes (usually <code>00</code>), e.g., <code>-04:00</code> for US Eastern Daylight Time (EDT). Note that unlike ISO 8601, this can be attached to dates as well as times, to ensure date boundaries fit the locale of interest. For example, <code>2015-05:00</code> represents the year of 2015 with an offset of minus five hours (US Eastern Standard Time). </p> <p> Date/time range queries also accept range qualifiers. The <code>gt</code> and <code>lt</code> qualifiers omit everything matching the implied time range of the specified value, whereas <code>gte</code> and <code>lte</code> include the entirety of said time range. </p> <p> The following examples are valid search queries. 
</p> <table class="table"> <thead> <tr> <th>Example</th> <th>Explanation</th> </tr> </thead> <tbody> <tr> <td> <code><a href="/search?q=created_at%3a2015">created_at:2015</a> </code></td> <td>Returns all uploads made in 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015%2b08:00">created_at:2015+08:00</a> </code></td> <td>Returns all uploads made in 2015 (SGT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04">created_at:2015-04</a> </code></td> <td>Returns all uploads made in April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-03%3a00">created_at:2015-04-03:00</a> </code></td> <td>Returns all uploads made in April 2015 (BRT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01">created_at:2015-04-01</a> </code></td> <td>Returns all uploads made on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01%2b08%3a00">created_at:2015-04-01+08:00</a> </code></td> <td>Returns all uploads made on 1 April 2015 (SGT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01">created_at:2015-04-01 01</a> </code></td> <td>Returns all uploads made in the hour of 1 AM on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01Z">created_at:2015-04-01 01Z</a> </code></td> <td> Returns all uploads made in the hour of 1 AM on 1 April 2015 (UTC). The zero UTC offset designator ("Zulu") is explicit. </td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01T01Z">created_at:2015-04-01T01Z</a> </code></td> <td> Returns all uploads made in the hour of 1 AM on 1 April 2015 (UTC). This uses the standard "T" separator associated with ISO 8601. 
</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01-04%3a00">created_at:2015-04-01 01-04:00</a> </code></td> <td>Returns all uploads made in the hour of 1 AM on 1 April 2015 (EDT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01%3a00">created_at:2015-04-01 01:00</a> </code></td> <td>Returns all uploads made sometime in the minute of 1:00 AM on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01%3a00Z">created_at:2015-04-01 01:00Z</a> </code></td> <td> Returns all uploads made sometime in the minute of 1:00 AM on 1 April 2015 (UTC). The zero UTC offset designator ("Zulu") is explicit. </td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+00%3a00%3a00">created_at:2015-04-01 00:00:00</a> </code></td> <td>Returns all uploads made exactly at midnight on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+00%3a00%3a00%2b08%3a00">created_at:2015-04-01 00:00:00+08:00</a> </code></td> <td>Returns all uploads made exactly at midnight on 1 April 2015 (SGT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at.lt%3a2015">created_at.lt:2015</a> </code></td> <td>Returns all uploads before the start of 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at.gte%3a2015-04-04">created_at.gte:2015-04-04</a> </code></td> <td>Returns all uploads since and including the entire day of 4 April 2015 (season 5 premiere, UTC).</td> </tr> </tbody> </table> <h3 id="supported-fields">Supported Fields</h3> <p> The following table enumerates all of the supported fields, with examples. 
</p> <table class="table"> <thead> <tr> <th>Field Selector</th> <th>Type</th> <th>Description</th> <th>Example</th> </tr> </thead> <tbody> <tr> <td> <code>aspect_ratio</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified aspect ratio.</td> <td> <code><a href="/search?q=aspect_ratio%3a1">aspect_ratio:1</a> </code></td> </tr> <tr> <td> <code>comment_count</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number of comments</td> <td> <code><a href="/search?q=comment_count.gt%3a50">comment_count.gt:50</a> </code></td> </tr> <tr> <td> <code>created_at</code> </td> <td>Date/Time Range</td> <td>Matches any image posted at the specified date and/or time.</td> <td> <code><a href="/search?q=created_at%3a2015-04-01">created_at:2015-04-01</a> </code></td> </tr> <tr> <td> <code>description</code> </td> <td>Full Text</td> <td> Full-text search against image descriptions with the specified string. </td> <td> <code><a href="/search?q=description%3aderp">description:derp</a> </code></td> </tr> <tr> <td> <code>downvotes</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified downvote count.</td> <td> <code><a href="/search?q=downvotes%3a0">downvotes:0</a> </code></td> </tr> <tr> <td> <code>faved_by</code> </td> <td>Literal</td> <td>Matches any image favorited by the specified user. 
Case-insensitive.</td> <td> <code><a href="/search?q=faved_by%3aroboshi">faved_by:roboshi</a> </code></td> </tr> <tr> <td> <code>faves</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number of favorites.</td> <td> <code><a href="/search?q=faves%3a20">faves:20</a> </code></td> </tr> <tr> <td> <code>height</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified height.</td> <td> <code><a href="/search?q=height%3a1080">height:1080</a> </code></td> </tr> <tr> <td> <code>id</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number.</td> <td> <code><a href="/search?q=id%3a111111">id:111111</a> </code></td> </tr> <tr> <td> <code>orig_sha512_hash</code> </td> <td>Literal</td> <td> Matches the <em>original</em> SHA-512 checksum of an uploaded image. </td> <td> <code><a href=""></a> </code></td> </tr> <tr> <td> <code>score</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified net score.</td> <td> <code><a href="/search?q=score.gt%3a200">score.gt:200</a> </code></td> </tr> <tr> <td> <code>sha512_hash</code> </td> <td>Literal</td> <td> Matches any image with the specified SHA-512 checksum. N.B.: Image optimization usually alters the original checksum! </td> <td> <code><a href=""></a> </code></td> </tr> <tr> <td> <code>source_url</code> </td> <td>Literal</td> <td> Matches image source URLs. Case-insensitive. </td> <td> <code><a href="/search?q=source_url%3a%2adeviantart.com%2a">source_url:*deviantart.com*</a> </code></td> </tr> <tr> <td> <code>tag_count</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number of tags.</td> <td> <code><a href="/search?q=tag_count.gt%3a10">tag_count.gt:10</a> </code></td> </tr> <tr> <td> <code>uploader</code> </td> <td>Literal</td> <td>Matches any image with the specified uploader account. 
Case-insensitive.</td> <td> <code><a href="/search?q=uploader%3ak_a">uploader:k_a</a> </code></td> </tr> <tr> <td> <code>upvotes</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified upvote count.</td> <td> <code><a href="/search?q=upvotes.gt%3a200">upvotes.gt:200</a> </code></td> </tr> <tr> <td> <code>width</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified width.</td> <td> <code><a href="/search?q=width%3a1920">width:1920</a> </code></td> </tr> <tr> <td> <code>wilson_score</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified lower bound of a 99.5% Wilson CI.</td> <td> <code><a href="/search?q=wilson_score.gt%3a0.9">wilson_score.gt:0.9</a> </code></td> </tr> </tbody> </table> <p> It is worth noting the absence of certain &ldquo;fields&rdquo; such as <code>artist</code> and <code>spoiler</code>. These are <em>tag namespaces,</em> not metadata, but they are functionally the same. Thus, a search for <code><a href="/search?q=spoiler%3as04">spoiler:s04</a> </code> performs as expected. </p> <h2 id="special-charcters">Special Characters and Suffixes</h2> <h3 id="wildcards">Wildcards</h3> <p> Wildcards allow for matching with terms that begin with, end with, or contain a given string of characters, like wildcards used in file management. Two wildcards are recognized: the asterisk (or star) and the question mark. </p> <p> An asterisk "expands" or matches to any number of characters in its place, including 0. For example, <code><a href="/search?q=apple%2a">apple*</a> </code> matches to uploads with any of the tags <code>apple bloom</code>, <code>applejack</code>, and simply <code>apple</code>. </p> <p> A question mark matches to a single character in its place. For example, <code><a href="/search?q=t%3fixie">t?ixie</a> </code> can match to either <code>trixie</code> or <code>twixie</code>. 
</p> <table class="table"> <thead> <tr> <th>Wildcard Character</th> <th>Match</th> </tr> </thead> <tbody> <tr> <td> <code>*</code> </td> <td>Zero or more characters</td> </tr> <tr> <td> <code>?</code> </td> <td>A single character</td> </tr> </tbody> </table> <h3 id="escaping">Escaping Special Characters</h3> <p> The use of special characters that modify search terms or exist outside search terms mandates a facility for &ldquo;escaping&rdquo; those characters, so that they are not excluded from search terms themselves. To use special characters within a search term, both of the conventional string escaping mechanisms are used: the backslash and quoting. The following are special characters and sequences that may need to be escaped: </p> <ul> <li> <code>( </code></li> <li> <code>) </code></li> <li> <code>*</code> </li> <li> <code>?</code> </li> <li> <code>-</code> (when placed in front of a term) </li> <li> <code>!</code> (when placed in front of a term) </li> <li> <code>,</code> </li> <li> <code>&&</code> </li> <li> <code>||</code> </li> <li> <code>OR</code> (if all-capitalized) </li> <li> <code>AND</code> (if all-capitalized) </li> <li> <code>NOT</code> (if all-capitalized) </li> <li> <code>"</code> </li> <li> <code>\</code> </li> <li> <code>~</code> (with <a href="#fuzzy">fuzzy matching syntax</a>) </li> <li> <code>^</code> (with <a href="#boosting">boosting</a>) </li> </ul> <p> A backslash is placed in front of a special character (and can also be placed in front of a sequence like the ones in the preceding list). This forces a given character to be counted as part of the preceding or following term. In front of any other character, it effectively has no effect. For example, <code><a href="/search?q=%5c%2d_%2d">\-_-</a> </code> forces a search for the emoticon <code>-_-</code>, despite it following the syntax for <code><a href="#negation">negation</a> </code> if without the backslash. 
Also consider the search term <code><a href="/search?q=rose+%5c%28flower%5c%29">rose \(flower\)</a></code>, although parentheses have intuitive rules that do not make escaping them necessary in most cases. The backslash is a special character and thus must also be escaped; a literal backslash is indicated with <code>\\</code>. </p> <p> The alternative to escaping is to simply surround the search query in double quotes (<code>"</code>), e.g., <code><a href="/search?q=%22rose+%28flower%29%22">"rose (flower)"</a></code>. When searching with a specified field, quotes <strong>must surround the field and colon as well</strong>, e.g., <code><a href="/search?q=%22width%3a1920%22">"width:1920"</a></code>. Everything in quotes is treated together as a verbatim search term, with one exception: the double quote character itself bounds the search term, so if it appears inside, it must be escaped with a backslash. <strong>All other uses of backslash are treated literally.</strong> </p> <h3 id="fuzzy">Approximate (Fuzzy) String Matching</h3> <p> The search engine backend, Apache Lucene, also enables so-called &ldquo;fuzzy&rdquo; string matching. Fuzzy string matching can be used with any literal search term, including the default tags field. A fuzzy match is specified using either a whole-number edit distance or a similarity factor ranging from 0 to 1.0. The whole number specifies an <i>optimal string alignment edit distance,</i> which is the maximum number of edits done to a string to match a given target, with an edit defined as a deletion, insertion, replacement, or switching two adjacent characters. One may alternatively define a similarity factor ranging from 0 to 1.0, with 1.0 being the least &ldquo;fuzzy&rdquo;. The derived edit distance is the length of the term sans the field name prefix, multiplied by the difference of unity minus the similarity factor, all rounded down. To specify either, a term is followed with a tilde followed by the edit distance or similarity factor. 
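</p> <p> As a rough illustration of the arithmetic just described, the following Python sketch derives an edit distance from a similarity factor. It is not the search engine's actual code: the function name <code>derived_edit_distance</code> and the choice to pass the similarity as a decimal string (so that binary floating-point rounding cannot turn 2 into 1) are assumptions made for this example, and the cap of 2 reflects the Lucene limit noted in this section. </p>

```python
import math
from decimal import Decimal

# Cap on the edit distance that this guide attributes to Lucene.
MAX_EDIT_DISTANCE = 2


def derived_edit_distance(term: str, similarity: str) -> int:
    """Return floor(len(term) * (1 - similarity)), capped at MAX_EDIT_DISTANCE.

    `term` is the search term without any field name prefix.
    `similarity` is a decimal string such as "0.8"; taking a string is a
    hypothetical convention of this sketch, used to avoid float rounding.
    """
    factor = Decimal(similarity)
    if not (Decimal(0) <= factor <= Decimal(1)):
        raise ValueError("similarity must lie between 0 and 1.0")
    # Decimal arithmetic keeps 10 * (1 - 0.8) at exactly 2.0 before flooring.
    distance = math.floor(len(term) * (Decimal(1) - factor))
    return min(distance, MAX_EDIT_DISTANCE)


print(derived_edit_distance("fluttersho", "0.8"))
```

<p>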
<strong> Note in both cases that Lucene caps the maximum edit distance at 2, as an optimization. Therefore, very large edit distances or small similarities will not behave as expected. </strong></p> <p> For example, <code><a href="/search?q=fluttersho%7e0.8">fluttersho~0.8</a> </code> searches for uploads with tags that approximately match <code>fluttersho</code>, with a similarity of 0.8. This is an edit distance of &lfloor;(1 &minus; 0.8)(10)&rfloor; = 2. Note that uploads tagged <code>fluttershy</code> are included in the result set. The utility of this is obvious: if you are unsure of a character or tag's exact spelling, you can use this as an aid, like a more manual and controlled version of Google's (in)famous spelling correction features. </p> <p> Fuzziness can also be applied to numeric queries to specify a range. In this case, the fuzziness parameter is the magnitude above and below the specified number that will be included in the result set. For example, <code><a href="/search?q=width%3a800%7e200">width:800~200</a> </code> specifies images with a width ranging from 600 (800 &minus; 200) to 1000 (800 &plus; 200), inclusive. </p> <p> Fuzzy matching can be freely applied to any term inside an <code><a href="#expressions">expression</a></code>. </p> <h2 id="grammar">Search Grammar: Term Operators and Combinations</h2> <h3 id="expressions">Expressions</h3> <p> Terms can be combined to define a search query corresponding to a specific result set. These combinations are formalized as <b>expressions</b> that are constructed from terms, operators, and even other expressions, which are then called <strong>subexpressions</strong>. Expressions recognized by the search frontend are the negation of a term or subexpression, the requirement of any search term or subexpression, or the requirement of both search terms or subexpressions. </p> <p> At its core, a search expression is either <strong>binary</strong> or <strong>unary</strong>. 
A binary expression consists of a term or subexpression, an <strong>operator</strong> indicating the type of expression, and another term or subexpression. Binary expressions can be &ldquo;chained&rdquo; by adding the operator followed by another term. A unary expression consists of the operator followed by a single term or subexpression. Both expression types and how to use subexpressions will be covered in the following sections. </p> <h3 id="expressions-summary">Summary Table</h3> <table class="table"> <thead> <tr> <th>Operator</th> <th>Symbols</th> <th>Comments</th> </tr> </thead> <tbody> <tr> <td>Negation (NOT)</td> <td> <ul> <li> <code>NOT</code> </li> <li> <code>-</code> </li> <li> <code>!</code> </li> </ul> </td> <td> Applied in front of a single term or parenthesized subexpression. The minus sign does not require padding to the right. Specifies that the term or subexpression <em>must not</em> match. </td> </tr> <tr> <td>Conjunction (AND)</td> <td> <ul> <li> <code>,</code> </li> <li> <code>&&</code> </li> <li> <code>AND</code> </li> </ul> </td> <td> Applied between two terms. The comma may be optionally padded with space on either side; the other forms must be padded. Specifies that both terms match. Can be chained to more terms. </td> </tr> <tr> <td>Disjunction (OR)</td> <td> <ul> <li> <code>||</code> </li> <li> <code>OR</code> </li> </ul> </td> <td> Applied between two terms, with surrounding space. Specifies that either of the terms match. Can be chained to more terms. </td> </tr> </tbody> </table> <h3 id="negation">Negation</h3> <p> <b>Negation</b> of a term or expression specifies that the original term or subexpression <em>must not</em> match. The corresponding negation operator is <b>unary</b>, that is, applied to either a single term or to a subexpression. It is specified with the all-capitalized word <code>NOT</code>, a dash of the non-multi-chromatic variety (<code>-</code>), or an exclamation point (<code>!</code>). 
For example, <code><a href="/search?q=%2dfluttershy">-fluttershy</a></code> or <code><a href="/search?q=NOT+fluttershy">NOT fluttershy</a></code> matches pictures that are <em>not</em> tagged with <code>fluttershy</code>. In set theory terms, this is taking the <em>complement</em> of the original result set, that is, all uploads outside it. </p> <h3 id="and-expressions">Commas and AND Expressions</h3> <p> An expression that queries for images that meet <em>all</em> specified terms is a <b>conjunction</b> or <b>AND expression</b>. As in the past, you can query images that meet a list of terms by hooking the terms together with commas. For example, <code><a href="/search?q=fluttershy%2cpinkie+pie">fluttershy,pinkie pie</a> </code> results in pictures that contain <em>both</em> the <code>fluttershy</code> and <code>pinkie pie</code> tags. In set theory terms, the result set is the intersection of uploads tagged <code>fluttershy</code> and uploads tagged <code>pinkie pie</code>. </p> <p> Commas can be padded with spaces however you like. Unlike the past, commas are now plain AND operators, so they are more versatile. As will be discussed, they can be used in subexpressions and alongside the OR operator. </p> <p> AND operators can also be expressed using <code>&&</code> (derived from typical programming notation) or the all-capitalized word <code>AND</code>, e.g., <code><a href="/search?q=rarity+%26%26+pinkie+pie">rarity && pinkie pie</a></code> or <code><a href="/search?q=rarity+AND+pinkie+pie">rarity AND pinkie pie</a></code>. These forms, unlike the comma, require padding space on either side. </p> <h3 id="or-expressions">OR Expressions</h3> <p> A <b>disjunction</b> or <b>OR expression</b> requests uploads that meet <em>any</em> of the specified search terms. This is markedly different from the aforementioned AND expression, which, to reiterate, mandates that <em>all</em> terms match. 
OR operators are expressed either with <code>||</code> (also a programming notation) or the all-capitalized word <code>OR</code>, e.g., <code><a href="/search?q=rarity+%7c%7c+pinkie+pie">rarity || pinkie pie</a></code> or <code><a href="/search?q=rarity+OR+pinkie+pie">rarity OR pinkie pie</a></code>. In set theory terms, the result set is the union of uploads tagged <code>rarity</code> and uploads tagged <code>pinkie pie</code>. All forms of the OR operator require padding on either side. </p> <h3 id="compound">Compound Expressions</h3> <p> Complex combinations of terms, and therefore search criteria, are possible by combining expressions together. Doing so effectively is analogous to arithmetic. Consider multiplication and addition (which in so-called <em>Boolean algebra</em> are respectively analogous to AND and OR operations). We can express an algebraic expression with multiplication and addition several ways. For three terms, <i>A</i>, <i>B</i>, and <i>C</i>, consider the expression <i>A</i> &times; <i>B</i> &plus; <i>C</i>. Multiplication is evaluated before addition, so this expression is equivalent to (<i>A</i> &times; <i>B</i>) &plus; <i>C,</i> in which case the order of operations is explicit. </p> <h4 id="precedence">Operator Precedence</h4> <p> Likewise, precedence is applied to determine the order in which chained OR, AND, and NOT operations are evaluated. The order of operations in the search syntax is as follows: </p> <ol> <li>negation (NOT)</li> <li>conjunction (AND)</li> <li>disjunction (OR)</li> </ol> <p> Consider the query <code><a href="/search?q=twilight+sparkle+%7c%7c+fluttershy+%26%26+pinkie+pie">twilight sparkle || fluttershy && pinkie pie</a></code>. In this example, <code>fluttershy && pinkie pie</code> is evaluated first, as an implicit <em>subexpression.</em> Then, that result is OR'd together with <code>twilight sparkle</code>. 
Thus, the query instructs the engine to return uploads <em>either</em> tagged with <code>twilight sparkle</code> <em>or</em> tagged with <em>both</em> <code>fluttershy</code> <em>and</em> <code>pinkie pie</code>. Note how if the OR expression <code>twilight sparkle || fluttershy</code> were evaluated first, the result set would differ. </p> <h4 id="parentheses">Defining Subexpressions with Parentheses</h4> <p> Returning to an earlier example with arithmetic, we can override the order of operations using explicit subexpressions. This requires the use of <em>delimiters</em> that act as boundaries, and most often parentheses are used for this purpose. Hence, <i>A</i> &times; (<i>B</i> &plus; <i>C</i>) forces <i>B</i> &plus; <i>C</i> to be evaluated, and then multiplied with <i>A</i>, which is contrary to the order otherwise followed. Likewise, <code><a href="/search?q=%28twilight+sparkle+%7c%7c+fluttershy%29+%26%26+pinkie+pie">(twilight sparkle || fluttershy) && pinkie pie</a> </code> instructs the search engine to return results that have <em>either</em> <code>twilight sparkle</code> <em> or</em> <code>fluttershy</code> <em>and always match</em> the tag <code>pinkie pie</code>. </p> <p> As was mentioned earlier, the unary NOT operator can be applied to parenthesized subexpressions. The semantics of this is analogous to applying it to a single term: a negated subexpression specifies uploads that <em>do not</em> adhere to what the subexpression specifies. For example, the query <code><a href="/search?q=%2d%28pinkamena+diane+pie%2c+grimdark%29">-(pinkamena diane pie, grimdark)</a> </code> returns all uploads that are <em>not</em> tagged with <em>both</em> <code>pinkamena diane pie</code> <em>and</em> <code>grimdark</code>. Uploads tagged with <em>either</em> of the two would be returned as long as they do not have both. 
Thus light-hearted Pinkamena images and grimdark material not involving Pinkamena would be included, yet the intersection of those two sets of images would be excluded, that is, images that are grimdark and feature Pinkamena. </p> <p> Explicit subexpressions with parentheses allow for complex queries as they can be arbitrarily nested inside other subexpressions, to fine-tune the result set even more. </p> <h4 id="auto-escaping">Automatic Parentheses Escaping</h4> <p> Finally, a footnote about parentheses is warranted. Traditionally, if an expression parser encounters an open parenthesis without a closing parenthesis, or if parentheses are swapped, an error is raised. This is indeed the case with the search engine, as highlighted in the search parsing error page. However, to a limited extent, a term can contain parentheses within. Parentheses are accepted within search terms as long as they are closed and do not cover the entire expression. The first limit is a heuristic to address the typical use of parentheses, and the second arises from the legal use of parentheses to single out a term. Thus, the search <code><a href="/search?q=rose+%28flower%29">rose (flower)</a> </code> searches for uploads tagged with <code>rose (flower)</code>; however, the emoticon query <code><a href="/search?q=%29%29B-%28">))B-(</a> </code> raises an error, while <code><a href="/search?q=%28q%29">(q)</a> </code> effectively searches for <code>q</code>, instead. For the latter two examples, simply surround with double quotes to clarify your meaning to the search engine. </p> <h2 id="boosting">Boosting Terms</h2> <p> The search engine also allows the boosting of specific terms when sorting by relevance, so that uploads including or not including the term occur earlier or later in the results. Boosting is done by modifying a term's relevance score with a positive or negative value. 
This value is affixed to a term with a preceding caret (<code>^</code>) and with a positive or negative decimal number. For example, <code><a href="/search?q=pinkie+pie%5e1+%7c%7c+tara+strong&sf=relevance&sd=desc">pinkie pie^1 || tara strong</a> </code> returns uploads tagged either with <code>pinkie pie</code> or <code>tara strong</code>, but when sorting by relevance descending, uploads with <code>pinkie pie</code> are prioritized. A negative value meanwhile reduces the relevance score and deprioritizes the affected term when sorting by relevance, e.g., <code><a href="/search?q=pinkie+pie%5e-1+%7c%7c+tara+strong&sf=relevance&sd=desc">pinkie pie^-1 || tara strong</a></code>. Sorting options are found below the search box on this page and <strong>must be set to sort by relevance</strong> for boosting to take proper effect. Note that in both cases, pictures with <em>both</em> tags will still appear first. </p>
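<p> The example links in this guide percent-encode their queries and, for boosting, pass <code>sf=relevance</code> and <code>sd=desc</code> alongside <code>q</code>. As a minimal sketch of how such a link might be assembled, the Python snippet below builds one with the standard library. The helper name <code>search_url</code> is hypothetical, and the <code>/search</code> path and parameter names are simply taken from the links in this guide rather than from any documented API. </p>

```python
from urllib.parse import urlencode


def search_url(query, sort_field=None, sort_dir=None, base="/search"):
    """Build a search link like the ones used in this guide.

    `query` is the raw search string, e.g. "pinkie pie^1 || tara strong".
    `sort_field` and `sort_dir` map to the `sf` and `sd` parameters seen in
    the guide's boosting links; they are omitted when not given.
    """
    params = {"q": query}
    if sort_field is not None:
        params["sf"] = sort_field
    if sort_dir is not None:
        params["sd"] = sort_dir
    # urlencode applies quote_plus: spaces become "+", and "^", "|", ":"
    # become percent escapes (with uppercase hex digits).
    return base + "?" + urlencode(params)


print(search_url("pinkie pie^-1 || tara strong", "relevance", "desc"))
```

<p> The escapes come out with uppercase hex digits (<code>%5E</code> rather than <code>%5e</code>); percent-encoding is case-insensitive in the hex digits, so both forms denote the same query. </p>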
Sonata Dusk<p><i>(Note: Directly kipped from Derpibooru for now, updating examples and making the text easier to read comes later. - Sonata, May 20th 2020) <p><i>(Note: Directly kipped from Derpibooru for now, updating examples and making the text easier to read comes later. - Sonata, May 20th 2020)</i> <h2 id="about">About Search Syntax</h2> <p> The search engine allows users to locate uploads via tags that are associated with an upload and some other metadata such as uploader and user count. It also permits chaining together specified tags and metadata to search for specific logical combinations, to allow more precise filtering. This guide explains the syntax and features of individual terms and then shows how they are combined into more complex queries. </p> <ol> <li><a href="#about">About Search Syntax</a> </li> <li> <a href="#terms">Search Terms</a> <ol> <li><a href="#tag-behavior">Tag Search Behavior</a> </li> <li><a href="#fields">Searching Through Other Fields</a> </li> <li><a href="#numeric-range">Numeric Range Queries</a> </li> <li><a href="#date-range">Date/Time Range Queries</a> </li> <li><a href="#supported-fields">Supported Fields</a> </li> </ol> </li> <li> <a href="#special-characters">Special Characters and Suffixes</a> <ol> <li><a href="#wildcards">Wildcards</a> </li> <li><a href="#escaping">Escaping Special Characters</a> </li> <li><a href="#fuzzy">Approximate (Fuzzy) String Matching</a> </li> </ol> </li> <li> <a href="#grammar">Search Grammar: Term Operators and Combinations</a> <ol> <li><a href="#expressions">Expressions</a> </li> <li><a href="#expressions-summary">Summary Table</a> </li> <li><a href="#negation">Negation</a> </li> <li><a href="#and-expressions">Commas and AND Expressions</a> </li> <li><a href="#or-expressions">OR Expressions</a> </li> <li> <a href="#compound">Compound Expressions</a> <ol> <li><a href="#precedence">Operator Precedence</a> </li> <li><a href="#parentheses">Defining Subexpressions with Parentheses</a> </li> <li><a 
href="#auto-escaping">Automatic Parentheses Escaping</a> </li> </ol> </li> </ol> </li> <li><a href="#boosting">Boosting Terms</a> </li> </ol> <h2 id="terms">Search Terms</h2> <p> Specific searches require the inclusion of search terms, which individually define the criteria expected of each upload result to be returned by the search engine. </p> <h3 id="tag-behavior">Tag Search Behavior</h3> <p> Searching a single term is straightforward: simply type in the term you want. By default, the term you use will be searched among the indexed image tags and aliases. Thus, a search for <code><a href="/search?q=pinkie+pie">pinkie pie</a></code> would, as you may surmise, result in all appropriately tagged and indexed pictures of Pinkie Pie. Aliases are also indexed, so a search for the tag alias <code><a href="/search?q=ts">ts</a></code> is the same as one for <code><a href="/search?q=twilight+sparkle">twilight sparkle</a></code>. </p> <p> Tag searches are case-insensitive, so capitalization is irrelevant for queries. For example, the search queries <code><a href="/search?q=pinkie+pie">pinkie pie</a> </code> and <code><a href="/search?q=Pinkie+Pie">Pinkie Pie</a> </code> will share the same result set. </p> <h3 id="fields">Searching Through Other Fields</h3> <p> Other fields are also indexed, and you can search them using the namespace convention that is also used by tags. Namely, one enters the field name followed by a colon, and finally, the target value. For example, to search for images with a width of 1920, we would search within the <code>width</code> field and so construct the query <code><a href="/search?q=width%3a1920">width:1920</a></code>. If a tag's namespace happens to match the name of a field, the tag can still be queried via quoting or escaping. 
</p> <h3 id="numeric-range">Numeric Range Queries</h3> <p> Numeric fields in particular support queries for ranges of possible values. A qualifier can be appended to the field name, separated by a single period, to request results that are greater than or less than the supplied value; the supplied value itself can optionally be included, too. To find images with a score greater than 100, we would enter <code><a href="/search?q=score.gt%3a100">score.gt:100</a></code>. For an inclusive search of scores greater than <em>or equal to</em> 100, we would instead enter <code><a href="/search?q=score.gte%3a100">score.gte:100</a></code>. The following table enumerates the supported qualifiers. </p> <table class="table"> <thead> <tr> <th>Qualifier</th> <th>Meaning</th> <th>Example</th> </tr> </thead> <tbody> <tr> <td> <code>gt</code> </td> <td> Values greater than specified, and not including the specified value </td> <td> <code><a href="/search?q=score.gt%3a100">score.gt:100</a> </code></td> </tr> <tr> <td> <code>gte</code> </td> <td> Values greater than or equal to specified </td> <td> <code><a href="/search?q=score.gte%3a100">score.gte:100</a> </code></td> </tr> <tr> <td> <code>lt</code> </td> <td> Values less than specified, and not including the specified value </td> <td> <code><a href="/search?q=width.lt%3a100">width.lt:100</a> </code></td> </tr> <tr> <td> <code>lte</code> </td> <td> Values less than or equal to specified </td> <td> <code><a href="/search?q=width.lte%3a100">width.lte:100</a> </code></td> </tr> </tbody> </table> <h3 id="date-range">Date/Time Range Queries</h3> <p> Date and time values are specified using a tweaked subset of the <a href="https://en.wikipedia.org/wiki/ISO_8601">ISO 8601 standard</a>. A full date is specified by a four-digit year followed by a two-digit month and day, with each value delimited by a hyphen, i.e., "YYYY-mm-DD". 
As in ISO 8601, one can specify just the month or even just the year, as long as the less precise information is included in left-to-right order without dangling hyphens. This is semantically interpreted as the range of the entire period (not just the first day of the month, etc.). For example, <code>2015-04</code> represents the entire month of April 2015. </p> <p> Given a full date, a specification for the time of day can be added. To do so, separate the time from the date with a <code>T</code> or a space, followed by the hours, minutes, and seconds, each specified with two digits and separated by a colon, i.e., "HH:MM:SS". The hours follow a 24-hour clock. As with date values, one may alternatively specify entire minutes and even hours by truncating the value without dangling colons. The value <code>2014-04-20 16</code> represents the entire hour of 4 PM on 20 April 2014 (UTC). The entire first minute of that hour can be specified with <code>2014-04-20 16:00</code>. </p> <p> By default, time follows international UTC ("Zulu") time. (In terms of the ISO 8601 standard, a <code>Z</code> suffix is implied.) One may specify an offset for local time by affixing a plus or minus sign, followed by the offset hours as two digits, a colon, and the offset minutes (usually <code>00</code>), e.g., <code>-04:00</code> for US Eastern Daylight Time (EDT). Note that unlike ISO 8601, this can be attached to dates as well as times, to ensure date boundaries fit the locale of interest. For example, <code>2015-05:00</code> represents the year of 2015 with an offset of minus five hours (US Eastern Standard Time). </p> <p> Date/time range queries also accept range qualifiers. The <code>gt</code> and <code>lt</code> qualifiers omit everything matching the implied time range of the specified value, whereas <code>gte</code> and <code>lte</code> include the entirety of said time range. </p> <p> The following examples are valid search queries. 
</p> <table class="table"> <thead> <tr> <th>Example</th> <th>Explanation</th> </tr> </thead> <tbody> <tr> <td> <code><a href="/search?q=created_at%3a2015">created_at:2015</a> </code></td> <td>Returns all uploads made in 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015%2b08:00">created_at:2015+08:00</a> </code></td> <td>Returns all uploads made in 2015 (SGT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04">created_at:2015-04</a> </code></td> <td>Returns all uploads made in April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-03%3a00">created_at:2015-04-03:00</a> </code></td> <td>Returns all uploads made in April 2015 (BRT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01">created_at:2015-04-01</a> </code></td> <td>Returns all uploads made on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01%2b08%3a00">created_at:2015-04-01+08:00</a> </code></td> <td>Returns all uploads made on 1 April 2015 (SGT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01">created_at:2015-04-01 01</a> </code></td> <td>Returns all uploads made in the hour of 1 AM on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01Z">created_at:2015-04-01 01Z</a> </code></td> <td> Returns all uploads made in the hour of 1 AM on 1 April 2015 (UTC). The zero UTC offset designator ("Zulu") is explicit. </td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01T01Z">created_at:2015-04-01T01Z</a> </code></td> <td> Returns all uploads made in the hour of 1 AM on 1 April 2015 (UTC). This uses the standard "T" separator associated with ISO 8601. 
</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01-04%3a00">created_at:2015-04-01 01-04:00</a> </code></td> <td>Returns all uploads made in the hour of 1 AM on 1 April 2015 (EDT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01%3a00">created_at:2015-04-01 01:00</a> </code></td> <td>Returns all uploads made sometime in the minute of 1:00 AM on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+01%3a00Z">created_at:2015-04-01 01:00Z</a> </code></td> <td> Returns all uploads made sometime in the minute of 1:00 AM on 1 April 2015 (UTC). The zero UTC offset designator ("Zulu") is explicit. </td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+00%3a00%3a00">created_at:2015-04-01 00:00:00</a> </code></td> <td>Returns all uploads made exactly at midnight on 1 April 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at%3a2015-04-01+00%3a00%3a00%2b08%3a00">created_at:2015-04-01 00:00:00+08:00</a> </code></td> <td>Returns all uploads made exactly at midnight on 1 April 2015 (SGT).</td> </tr> <tr> <td> <code><a href="/search?q=created_at.lt%3a2015">created_at.lt:2015</a> </code></td> <td>Returns all uploads before the start of 2015 (UTC).</td> </tr> <tr> <td> <code><a href="/search?q=created_at.gte%3a2015-04-04">created_at.gte:2015-04-04</a> </code></td> <td>Returns all uploads since and including the entire day of 4 April 2015 (season 5 premiere, UTC).</td> </tr> </tbody> </table> <h3 id="supported-fields">Supported Fields</h3> <p> The following table enumerates all of the supported fields, with examples. 
</p> <table class="table"> <thead> <tr> <th>Field Selector</th> <th>Type</th> <th>Description</th> <th>Example</th> </tr> </thead> <tbody> <tr> <td> <code>aspect_ratio</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified aspect ratio.</td> <td> <code><a href="/search?q=aspect_ratio%3a1">aspect_ratio:1</a> </code></td> </tr> <tr> <td> <code>comment_count</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number of comments</td> <td> <code><a href="/search?q=comment_count.gt%3a50">comment_count.gt:50</a> </code></td> </tr> <tr> <td> <code>created_at</code> </td> <td>Date/Time Range</td> <td>Matches any image posted at the specified date and/or time.</td> <td> <code><a href="/search?q=created_at%3a2015-04-01">created_at:2015-04-01</a> </code></td> </tr> <tr> <td> <code>description</code> </td> <td>Full Text</td> <td> Full-text search against image descriptions with the specified string. </td> <td> <code><a href="/search?q=description%3aderp">description:derp</a> </code></td> </tr> <tr> <td> <code>downvotes</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified downvote count.</td> <td> <code><a href="/search?q=downvotes%3a0">downvotes:0</a> </code></td> </tr> <tr> <td> <code>faved_by</code> </td> <td>Literal</td> <td>Matches any image favorited by the specified user. 
Case-insensitive.</td> <td> <code><a href="/search?q=faved_by%3aroboshi">faved_by:roboshi</a> </code></td> </tr> <tr> <td> <code>faves</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number of favorites.</td> <td> <code><a href="/search?q=faves%3a20">faves:20</a> </code></td> </tr> <tr> <td> <code>height</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified height.</td> <td> <code><a href="/search?q=height%3a1080">height:1080</a> </code></td> </tr> <tr> <td> <code>id</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number.</td> <td> <code><a href="/search?q=id%3a111111">id:111111</a> </code></td> </tr> <tr> <td> <code>orig_sha512_hash</code> </td> <td>Literal</td> <td> Matches the <em>original</em> SHA-512 checksum of an uploaded image. </td> <td> <code><a href=""></a> </code></td> </tr> <tr> <td> <code>score</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified net score.</td> <td> <code><a href="/search?q=score.gt%3a200">score.gt:200</a> </code></td> </tr> <tr> <td> <code>sha512_hash</code> </td> <td>Literal</td> <td> Matches any image with the specified SHA-512 checksum. N.B.: Image optimization usually alters the original checksum! </td> <td> <code><a href=""></a> </code></td> </tr> <tr> <td> <code>source_url</code> </td> <td>Literal</td> <td> Matches image source URLs. Case-insensitive. </td> <td> <code><a href="/search?q=source_url%3a%2adeviantart.com%2a">source_url:*deviantart.com*</a> </code></td> </tr> <tr> <td> <code>tag_count</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified number of tags.</td> <td> <code><a href="/search?q=tag_count.gt%3a10">tag_count.gt:10</a> </code></td> </tr> <tr> <td> <code>uploader</code> </td> <td>Literal</td> <td>Matches any image with the specified uploader account. 
Case-insensitive.</td> <td> <code><a href="/search?q=uploader%3ak_a">uploader:k_a</a> </code></td> </tr> <tr> <td> <code>upvotes</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified upvote count.</td> <td> <code><a href="/search?q=upvotes.gt%3a200">upvotes.gt:200</a> </code></td> </tr> <tr> <td> <code>width</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified width.</td> <td> <code><a href="/search?q=width%3a1920">width:1920</a> </code></td> </tr> <tr> <td> <code>wilson_score</code> </td> <td>Numeric Range</td> <td>Matches any image with the specified lower bound of a 99.5% Wilson CI.</td> <td> <code><a href="/search?q=wilson_score.gt%3a0.9">wilson_score.gt:0.9</a> </code></td> </tr> </tbody> </table> <p> It is worth noting the absence of certain &ldquo;fields&rdquo; such as <code>artist</code> and <code>spoiler</code>. These are <em>tag namespaces,</em> not metadata, but they are functionally the same. Thus, a search for <code><a href="/search?q=spoiler%3as04">spoiler:s04</a> </code> performs as expected. </p> <h2 id="special-characters">Special Characters and Suffixes</h2> <h3 id="wildcards">Wildcards</h3> <p> Wildcards allow for matching with terms that begin with, end with, or contain a given string of characters, like wildcards used in file management. Two wildcards are recognized: the asterisk (or star) and the question mark. </p> <p> An asterisk "expands" or matches to any number of characters in its place, including 0. For example, <code><a href="/search?q=apple%2a">apple*</a> </code> matches uploads with any of the tags <code>apple bloom</code>, <code>applejack</code>, and simply <code>apple</code>. </p> <p> A question mark matches a single character in its place. For example, <code><a href="/search?q=t%3fixie">t?ixie</a> </code> can match either <code>trixie</code> or <code>twixie</code>. 
</p> <table class="table"> <thead> <tr> <th>Wildcard Character</th> <th>Match</th> </tr> </thead> <tbody> <tr> <td> <code>*</code> </td> <td>Zero or more characters</td> </tr> <tr> <td> <code>?</code> </td> <td>A single character</td> </tr> </tbody> </table> <h3 id="escaping">Escaping Special Characters</h3> <p> The use of special characters that modify search terms or exist outside search terms mandates a facility for &ldquo;escaping&rdquo; those characters, so that they are not excluded from search terms themselves. To use special characters within a search term, both of the conventional string escaping mechanisms are used: the backslash and quoting. The following are special characters and sequences that may need to be escaped: </p> <ul> <li> <code>( </code></li> <li> <code>) </code></li> <li> <code>*</code> </li> <li> <code>?</code> </li> <li> <code>-</code> (when placed in front of a term) </li> <li> <code>!</code> (when placed in front of a term) </li> <li> <code>,</code> </li> <li> <code>&&</code> </li> <li> <code>||</code> </li> <li> <code>OR</code> (if all-capitalized) </li> <li> <code>AND</code> (if all-capitalized) </li> <li> <code>NOT</code> (if all-capitalized) </li> <li> <code>"</code> </li> <li> <code>\</code> </li> <li> <code>~</code> (with <a href="#fuzzy">fuzzy matching syntax</a>) </li> <li> <code>^</code> (with <a href="#boosting">boosting</a>) </li> </ul> <p> A backslash is placed in front of a special character (and can also be placed in front of a sequence like the ones in the preceding list). This forces a given character to be counted as part of the preceding or following term. In front of any other character, it effectively has no effect. For example, <code><a href="/search?q=%5c%2d_%2d">\-_-</a> </code> forces a search for the emoticon <code>-_-</code>, despite it following the syntax for <code><a href="#negation">negation</a> </code> if without the backslash. 
Also consider the search term <code><a href="/search?q=rose+%5c%28flower%5c%29">rose \(flower\)</a></code>, although parentheses have intuitive rules that do not make escaping them necessary in most cases. The backslash is a special character and thus must also be escaped; a literal backslash is indicated with <code>\\</code>. </p> <p> The alternative to escaping is to simply surround the search query in double quotes (<code>"</code>), e.g., <code><a href="/search?q=%22rose+%28flower%29%22">"rose (flower)"</a></code>. When searching with a specified field, quotes <strong>must surround the field and colon as well</strong>, e.g., <code><a href="/search?q=%22width%3a1920%22">"width:1920"</a></code>. Everything in quotes is together treated as a verbatim search term, with one exception. Note that the double quote character itself bounds the search term, so if it appears inside, it must be escaped with a backslash. <strong>All other uses of backslash are treated literally.</strong> </p> <h3 id="fuzzy">Approximate (Fuzzy) String Matching</h3> <p> The search engine backend, Apache Lucene, also enables so-called &ldquo;fuzzy&rdquo; string matching. Fuzzy string matching can be used with any literal search term, including the default tags field. A fuzzy match is specified using either a similarity factor ranging from 0 to 1.0 or a whole number. The whole number specifies an <i>optimal string alignment edit distance,</i> which is the maximum number of edits done to a string to match a given target, with an edit defined as a deletion, insertion, replacement, or switching two adjacent characters. One may alternatively define a similarity factor ranging from 0 to 1.0, with 1.0 being the least &ldquo;fuzzy&rdquo;. The derived edit distance is the length of the term sans the field name prefix, multiplied by the difference of unity minus the similarity factor, all rounded down. To specify either, a term is followed with a tilde followed by the edit distance or similarity factor. 
<strong> Note in both cases that Lucene caps the maximum edit distance at 2, as an optimization. Therefore, very large edit distances or small similarities will not behave as expected. </strong></p> <p> For example, <code><a href="/search?q=fluttersho%7e0.8">fluttersho~0.8</a> </code> searches for uploads with tags that approximately match <code>fluttersho</code>, with a similarity of 0.8. This is an edit distance of &lfloor;(1 &minus; 0.8)(10)&rfloor; = 2. Note that uploads tagged <code>fluttershy</code> are included in the result set. The utility of this is obvious: if you are unsure of a character or tag's exact spelling, you can use this as an aid, like a more manual and controlled version of Google's (in)famous spelling correction features. </p> <p> Fuzziness can also be applied to numeric queries to specify a range. In this case, the fuzziness parameter is the magnitude above and below the specified number that will be included in the result set. For example, <code><a href="/search?q=width%3a800%7e200">width:800~200</a> </code> specifies images with a width ranging from 600 (800 &minus; 200) to 1000 (800 &plus; 200), inclusive. </p> <p> Fuzzy matching can be freely applied to any term inside an <code><a href="#expressions">expression</a></code>. </p> <h2 id="grammar">Search Grammar: Term Operators and Combinations</h2> <h3 id="expressions">Expressions</h3> <p> Terms can be combined to define a search query corresponding to a specific result set. These combinations are formalized as <b>expressions</b> that are constructed from terms, operators, and even other expressions, which are then called <strong>subexpressions</strong>. Expressions recognized by the search frontend are the negation of a term or subexpression, the requirement of any search term or subexpression, or the requirement of both search terms or subexpressions. </p> <p> At its core, a search expression is either <strong>binary</strong> or <strong>unary</strong>. 
A binary expression consists of a term or subexpression, an <strong>operator</strong> indicating the type of expression, and another term or subexpression. Binary expressions can be &ldquo;chained&rdquo; by adding the operator followed by another term. A unary expression consists of the operator followed by a single term or subexpression. Both expression types and how to use subexpressions will be covered in the following sections. </p> <h3 id="expressions-summary">Summary Table</h3> <table class="table"> <thead> <tr> <th>Operator</th> <th>Symbols</th> <th>Comments</th> </tr> </thead> <tbody> <tr> <td>Negation (NOT)</td> <td> <ul> <li> <code>NOT</code> </li> <li> <code>-</code> </li> <li> <code>!</code> </li> </ul> </td> <td> Applied in front of a single term or parenthesized subexpression. The minus sign does not require padding to the right. Specifies that the term or subexpression <em>must not</em> match. </td> </tr> <tr> <td>Conjunction (AND)</td> <td> <ul> <li> <code>,</code> </li> <li> <code>&&</code> </li> <li> <code>AND</code> </li> </ul> </td> <td> Applied between two terms. The comma may be optionally padded with space on either side; the other forms must be padded. Specifies that both terms match. Can be chained to more terms. </td> </tr> <tr> <td>Disjunction (OR)</td> <td> <ul> <li> <code>||</code> </li> <li> <code>OR</code> </li> </ul> </td> <td> Applied between two terms, with surrounding space. Specifies that either of the terms match. Can be chained to more terms. </td> </tr> </tbody> </table> <h3 id="negation">Negation</h3> <p> <b>Negation</b> of a term or expression specifies that the original term or subexpression <em>must not</em> match. The corresponding negation operator is <b>unary</b>, that is, applied to either a single term or to a subexpression. It is specified with the all-capitalized word <code>NOT</code>, a dash of the non-multi-chromatic variety (<code>-</code>), or an exclamation point (<code>!</code>). 
For example, <code><a href="/search?q=%2dfluttershy">-fluttershy</a></code> or <code><a href="/search?q=NOT+fluttershy">NOT fluttershy</a></code> matches pictures that are <em>not</em> tagged with <code>fluttershy</code>. In set theory terms, this is taking the <em>complement</em> of the original result set, that is, all uploads outside it. </p> <h3 id="and-expressions">Commas and AND Expressions</h3> <p> An expression that queries for images that meet <em>all</em> specified terms is a <b>conjunction</b> or <b>AND expression</b>. As in the past, you can query images that meet a list of terms by hooking the terms together with commas. For example, <code><a href="/search?q=fluttershy%2cpinkie+pie">fluttershy,pinkie pie</a> </code> results in pictures that contain <em>both</em> the <code>fluttershy</code> and <code>pinkie pie</code> tags. In set theory terms, the result set is the intersection of uploads tagged <code>fluttershy</code> and uploads tagged <code>pinkie pie</code>. </p> <p> Commas can be padded with spaces however you like. Unlike the past, commas are now plain AND operators, so they are more versatile. As will be discussed, they can be used in subexpressions and alongside the OR operator. </p> <p> AND operators can also be expressed using <code>&&</code> (derived from typical programming notation) or the all-capitalized word <code>AND</code>, e.g., <code><a href="/search?q=rarity+%26%26+pinkie+pie">rarity && pinkie pie</a></code> or <code><a href="/search?q=rarity+AND+pinkie+pie">rarity AND pinkie pie</a></code>. These forms, unlike the comma, require padding space on either side. </p> <h3 id="or-expressions">OR Expressions</h3> <p> A <b>disjunction</b> or <b>OR expression</b> requests uploads that meet <em>any</em> of the specified search terms. This is markedly different from the aforementioned AND expression, which, to reiterate, mandates that <em>all</em> terms match. 
OR operators are expressed either with <code>||</code> (also a programming notation) or the all-capitalized word <code>OR</code>, e.g., <code><a href="/search?q=rarity+%7c%7c+pinkie+pie">rarity || pinkie pie</a></code> or <code><a href="/search?q=rarity+OR+pinkie+pie">rarity OR pinkie pie</a></code>. In set theory terms, the result set is the union of uploads tagged <code>rarity</code> and uploads tagged <code>pinkie pie</code>. All forms of the OR operator require padding on either side. </p> <h3 id="compound">Compound Expressions</h3> <p> Complex combinations of terms, and therefore search criteria, are possible by combining expressions together. Doing so effectively is analogous to arithmetic. Consider multiplication and addition (which in so-called <em>Boolean algebra</em> are respectively analogous to AND and OR operations). We can express an algebraic expression with multiplication and addition several ways. For three terms, <i>A</i>, <i>B</i>, and <i>C</i>, consider the expression <i>A</i> &times; <i>B</i> &plus; <i>C</i>. Multiplication is evaluated before addition, so this expression is equivalent to (<i>A</i> &times; <i>B</i>) &plus; <i>C,</i> in which case the order of operations is explicit. </p> <h4 id="precedence">Operator Precedence</h4> <p> Likewise, precedence is applied to determine the order in which chained OR, AND, and NOT operations are evaluated. The order of operations in the search syntax is as follows: </p> <ol> <li>negation (NOT)</li> <li>conjunction (AND)</li> <li>disjunction (OR)</li> </ol> <p> Consider the query <code><a href="/search?q=twilight+sparkle+%7c%7c+fluttershy+%26%26+pinkie+pie">twilight sparkle || fluttershy && pinkie pie</a></code>. In this example, <code>fluttershy && pinkie pie</code> is evaluated first, as an implicit <em>subexpression.</em> Then, that result is OR'd together with <code>twilight sparkle</code>. 
Thus, the query instructs the engine to return uploads <em>either</em> tagged with <code>twilight sparkle</code> <em>or</em> tagged with <em>both</em> <code>fluttershy</code> <em>and</em> <code>pinkie pie</code>. Note how if the OR expression <code>twilight sparkle || fluttershy</code> were evaluated first, the result set would differ. </p> <h4 id="parentheses">Defining Subexpressions with Parentheses</h4> <p> Returning to an earlier example with arithmetic, we can override the order of operations using explicit subexpressions. This requires the use of <em>delimiters</em> that act as boundaries, and most often parentheses are used for this purpose. Hence, <i>A</i> &times; (<i>B</i> &plus; <i>C</i>) forces <i>B</i> &plus; <i>C</i> to be evaluated first, and then multiplied with <i>A</i>, which is contrary to the order otherwise followed. Likewise, <code><a href="/search?q=%28twilight+sparkle+%7c%7c+fluttershy%29+%26%26+pinkie+pie">(twilight sparkle || fluttershy) && pinkie pie</a> </code> instructs the search engine to return results that have <em>either</em> <code>twilight sparkle</code> <em> or</em> <code>fluttershy</code> <em>and always match</em> the tag <code>pinkie pie</code>. </p> <p> As was mentioned earlier, the unary NOT operator can be applied to parenthesized subexpressions. The semantics of this is analogous to applying it to a single term: a negated subexpression specifies uploads that <em>do not</em> adhere to what the subexpression specifies. For example, the query <code><a href="/search?q=%2d%28pinkamena+diane+pie%2c+grimdark%29">-(pinkamena diane pie, grimdark)</a> </code> returns all uploads that are <em>not</em> tagged with <em>both</em> <code>pinkamena diane pie</code> <em>and</em> <code>grimdark</code>. Uploads tagged with <em>either</em> of the two would be returned as long as they do not have both. 
Thus, light-hearted Pinkamena images and grimdark material not involving Pinkamena would be included, yet the intersection of those two sets of images would be excluded, that is, images that are grimdark and feature Pinkamena. </p> <p> Explicit subexpressions with parentheses allow for complex queries as they can be arbitrarily nested inside other subexpressions, to fine-tune the result set even more. </p> <h4 id="auto-escaping">Automatic Parentheses Escaping</h4> <p> Finally, a footnote about parentheses is warranted. Traditionally, if an expression parser encounters an open parenthesis without a closing parenthesis, or if parentheses are swapped, an error is raised. This is indeed the case with the search engine, as highlighted in the search parsing error page. However, to a limited extent, a term can contain parentheses within. Parentheses are accepted within search terms as long as they are closed and do not cover the entire expression. The first limit is a heuristic to address the typical use of parentheses, and the second arises from the legal use of parentheses to single out a term. Thus, the search <code><a href="/search?q=rose+%28flower%29">rose (flower)</a> </code> searches for uploads tagged with <code>rose (flower)</code>; however, the emoticon query <code><a href="/search?q=%29%29B-%28">))B-(</a> </code> raises an error, while <code><a href="/search?q=%28q%29">(q)</a> </code> effectively searches for <code>q</code> instead. For the latter two examples, simply surround the query with double quotes to clarify your meaning to the search engine. </p> <h2 id="boosting">Boosting Terms</h2> <p> The search engine also allows the boosting of specific terms when sorting by relevance, so that uploads including or not including the term occur earlier or later in the results. Boosting is done by modifying a term's relevance score with a positive or negative value. 
This value is affixed to a term with a caret (<code>^</code>) followed by a positive or negative decimal number. For example, <code><a href="/search?q=pinkie+pie%5e1+%7c%7c+tara+strong&sf=relevance&sd=desc">pinkie pie^1 || tara strong</a> </code> returns uploads tagged either with <code>pinkie pie</code> or <code>tara strong</code>, but when sorting by relevance descending, uploads with <code>pinkie pie</code> are prioritized. A negative value meanwhile reduces the relevance score and deprioritizes the affected term when sorting by relevance, e.g., <code><a href="/search?q=pinkie+pie%5e-1+%7c%7c+tara+strong&sf=relevance&sd=desc">pinkie pie^-1 || tara strong</a></code>. Sorting options are found below the search box on this page and <strong>must be set to sort by relevance</strong> for boosting to take proper effect. Note that in both cases, pictures with <em>both</em> tags will still appear first. </p>
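<p> Boosted queries contain characters such as <code>^</code> and <code>|</code> that must be percent-encoded when a search link is built by hand. The following is a minimal sketch of one way to do that in Python, not the site's own code. The helper name <code>build_search_url</code> is hypothetical; the <code>q</code>, <code>sf</code>, and <code>sd</code> parameter names are taken from the example links in this guide. </p>

```python
from urllib.parse import urlencode


def build_search_url(query: str, sort_by_relevance: bool = True) -> str:
    """Return a site-relative search URL for the given query (hypothetical helper)."""
    params = {"q": query}
    if sort_by_relevance:
        # Boosts only take effect when results are sorted by relevance,
        # so the sort field and direction seen in this guide's links are added.
        params["sf"] = "relevance"
        params["sd"] = "desc"
    # urlencode percent-encodes special characters such as ^ and |,
    # and turns spaces into plus signs.
    return "/search?" + urlencode(params)


print(build_search_url("pinkie pie^-1 || tara strong"))
# prints /search?q=pinkie+pie%5E-1+%7C%7C+tara+strong&sf=relevance&sd=desc
```

<p> Python emits uppercase hexadecimal digits (<code>%5E</code>) where the links in this guide use lowercase (<code>%5e</code>); the two forms are equivalent in a URL. </p>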