XPath, the Expression Language
XPath is usually associated with XML files because it was originally designed for XML and its many derivatives, such as HTML and XHTML.
XPath is an expression language that defines how to access an element in your file.
Quick example:
- Consider the following structured XML code:
<root>
<child1 />
<child2 />
</root>
- To access
child1
andchild2
, you would use:/root/child1
/root/child2
The structure is straightforward:
/root
represents the<root>
tag./child1
represents the<child1>
tag./root/child1
represents the<child1>
tag as a child of the<root>
tag.
XML XPath is a powerful language capable of describing any element route in your codebase.
CSS Selectors
As you probably know, CSS is composed of selectors and properties:
tag.class#id {
color: brown;
}
An equivalent XPath expression could be //tag[@class='class' and @id='id']
. It may look longer, but it’s still manageable.
CSS, as a separate language, has its own expression path system for evaluating elements in a document. This is known as the CSS Selector.
The reason for its existence isn’t as important as recognizing that it effectively separates languages and technologies. Over time, CSS has gained prefixes, variables, references, functions, inheritance, and more. XPath-like capabilities in CSS selectors have helped to achieve this functionality.
JSON Path
Consider the following JSON data:
{
"store": {
"book": [
{
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
},
{
"author": "J. R. R. Tolkien",
"title": "The Lord of the Rings",
"isbn": "0-395-19395-8",
"price": 22.99
}
]
}
}
To access the first book, “Sayings of the Century”, we use:
$.store.book[0]
This expression retrieves the entire object:
{
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
}
It resembles an XML XPath query like //store/book[0]
.
SQL
XPath-like structures are not exclusive to XML and CSS. SQL also has its own way of expressing paths:
SELECT table1.column1, table2.column1
FROM schema1.table1, schema2.table2
WHERE table1.column1 = 'value';
Here, schema.table
and table.column
define paths to data. A similar XPath-like structure for SQL could be:
//schema/table/column
More XPath
Look at your browser’s URL. It’s a structured path!
https://xepozz.github.io/meta-storm-idea-plugin/xpath/
Even domain zones represent a hierarchical namespace of possible domains worldwide. Originally, they corresponded to country regions.
Similarly, an IP address describes a location in the network.
Have you ever dialed a phone number? Your cellular XPath consists of:
- A
+
followed by the country code. - A regional number.
- A client number.
Even in a Unix terminal, when you type pwd
, the output looks like an XPath:
/Users/xepozz
XPath-like structures are everywhere!
PHP XPath?
What about PHP? Like Java, Python, or JavaScript, PHP has its own package addressing system:
App\Controller
in PHP.app.controller
in Java or Python.app/controller
in JavaScript.
I won’t go into the differences between namespaces and packages here, as they are irrelevant to XPath.
When copying the full path of an entity in an IDE, you might see:
- Functions:
App\Functions\function1
- Classes:
App\Controller\UserController
- Methods:
App\Controller\UserController::actionIndex
- Properties:
App\Controller\UserController::$property
- Constants:
App\Controller\UserController::CONSTANT
These follow a recognizable pattern:
{namespace}\{entity name}::{inner entity name}
Here, {inner entity name}
refers to:
$name
for properties.name
for methods and constants.
PHP has no official standardized XPath, but various approaches exist. Any way you reference an entity is essentially a custom path.
MetaStorm
MetaStorm introduces its own PHP XPath. This was necessary because frameworks often reference some magical points, e.g.:
All properties
of aclass
are determined by the$model
property type in$currentFile
.All methods
of aclass
are inferred from thereturn type
ofgetModelClass()
.- etc.
For named entities, like namespaces, functions, classes, and variables, the path follows the Fully Qualified Name (FQN).
However, anonymous or unnamed entities do not fit within this system.
By anonymous entities I mean:
['class' => '\Namespace\MyClass', 'property' => '1']
in context of a method call, such as '\Namespace\MyClass'
in the following:
$object->method('\Namespace\MyClass');
Currently, MetaStorm is able to jump between code entities using current code Xpath rules:
- XPath must be a FQCN or started with a relative point:
$containingClass
– means the class in which context the target is placed$argument
– related to the positioned argument: it’s$obj
in$service->method($obj, $prop)
$variable
– related to the variable holding the method: it’s$service
in$service->method()
- After you specified starting point you may walk further:
- Each “entrance” must be a dot symbol:
.
- Each “property” must be prefixed with a dollar symbol:
$
- Each “method return type” must have double squares as a suffix :
()
- Each “entrance” must be a dot symbol:
- Lookup supports:
- Typed properties:
private Class $property
- Untyped properties with reference inside:
private string $propertyClass = Class::class
- Typed methods’ returning:
public function methodA(): Class
- Typed properties:
As long as jumping between the files using current file Xpath rules
- XPath must be started from a relative point:
$project
– project base path$file
– related to the current file$directory
– related to the current directory
PHP Schema
The example I’ve described with array-like configuration:
[
'class' => '\Namespace\MyClass',
'property' => '1',
]
is well described by PHP Schema. Unfortunately, PHP does not have a standardized way to define and validate such array structures.
I have initiated internal discussions about this idea. It will be a long journey, but if you are interested in contributing, let me know.
A draft will be available soon: https://github.com/php-schema/php-schema
PHP Xpath
As PHP Schema evolves, an advanced method for accessing code entities is PHP XPath.
PHP XPath can help describe possible values in PHP Schema, reference dependent code entities, and navigate schema nodes for tooling purposes.
Initially, it was developed to simplify writing MetaStorm configurations, but it has since grown into a standalone, complex project.
I hope this effort will succeed, providing the PHP community with powerful tools similar to those available for XML, JSON, YAML, and many other languages.