mac_alias - Generate/parse Mac OS Alias records from Python
This document refers to version 2.0.0
What is this?
The Mac OS has a special data structure it calls an Alias, which allows programs that make use of it to locate the file to which it refers more reliably than they would be able to from e.g. a filename alone.
The format of this structure is not documented, so until recently you would have used Mac OS X APIs to construct and process Alias records. Sadly, Apple has deprecated the APIs in question in favour of its new “Bookmark” functionality; this is understandable, but it makes it tricky to construct an Alias record reliably in future.
This module contains code to parse and generate Alias records from a Pythonic equivalent data structure, and does not rely on the deprecated APIs.
It also contains code to parse and generate Bookmark records, again from a Pythonic equivalent, and without relying on OS X APIs.
Usage
To parse an Alias record given binary data:
from mac_alias import Alias
a = Alias.from_bytes(my_data)
To generate a binary Alias record:
a.to_bytes()
Finally, to build an Alias for a file:
Alias.for_file('/path/to/file.ext')
It’s probably best to resist the temptation to mess with the Alias class too much otherwise.
Similarly, to parse a Bookmark record given binary data:
from mac_alias import Bookmark
b = Bookmark.from_bytes(my_data)
To generate a binary Bookmark record:
b.to_bytes()
And to build a Bookmark for a file:
Bookmark.for_file('/path/to/file.ext')
Code Documentation
Contents:
mac_alias package
Classes
class mac_alias.Alias(appinfo='\x00\x00\x00\x00', version=2, volume=None, target=None, extra=[])
- appinfo = None: Application specific information (four byte byte-string)
- extra = None: A list of extra (tag, value) pairs
- target = None: A TargetInfo object describing the target
- version = None: Version (we support only version 2)
- volume = None: A VolumeInfo object describing the target’s volume
-
The AFP server
The username
The AppleShare zone
class mac_alias.TargetInfo(kind, filename, folder_cnid, cnid, creation_date, creator_code, type_code, levels_from=-1, levels_to=-1, folder_name=None, cnid_path=None, carbon_path=None, posix_path=None, user_home_prefix_len=None)
- carbon_path = None: The Carbon path of the target (optional)
- cnid = None: The CNID (Catalog Node ID) of the target
- cnid_path = None: The path from the volume root as a sequence of CNIDs (optional)
- creation_date = None: The target’s creation date
- creator_code = None: The target’s Mac creator code (a four-character binary string)
- filename = None: The filename of the target
- folder_cnid = None: The CNID (Catalog Node ID) of the target’s containing folder; CNIDs are similar to, but not the same as, traditional UNIX inode numbers
- folder_name = None: The (POSIX) name of the target’s containing folder (optional)
- kind = None: Either ALIAS_KIND_FILE or ALIAS_KIND_FOLDER
- levels_from = None: The depth of the alias? Always seems to be -1 on OS X.
- levels_to = None: The depth of the target? Always seems to be -1 on OS X.
- posix_path = None: The POSIX path of the target relative to the volume root. Note that this may or may not have a leading ‘/’ character, but it is always relative to the containing volume. (optional)
- type_code = None: The target’s Mac type code (a four-character binary string)
- user_home_prefix_len = None: If the path points into a user’s home folder, the number of folders deep that we go before we get to that home folder. (optional)
Constants
- mac_alias.ALIAS_HFS_VOLUME_SIGNATURE
The volume signature for HFS+.
- mac_alias.ALIAS_FIXED_DISK
- mac_alias.ALIAS_NETWORK_DISK
- mac_alias.ALIAS_400KB_FLOPPY_DISK
- mac_alias.ALIAS_800KB_FLOPPY_DISK
- mac_alias.ALIAS_1_44MB_FLOPPY_DISK
- mac_alias.ALIAS_EJECTABLE_DISK
Disk type constants.
- mac_alias.ALIAS_NO_CNID
A constant used where no CNID is present.
- mac_alias.kBookmarkPath
- mac_alias.kBookmarkCNIDPath
- mac_alias.kBookmarkFileProperties
- mac_alias.kBookmarkFileName
- mac_alias.kBookmarkFileID
- mac_alias.kBookmarkFileCreationDate
- mac_alias.kBookmarkTOCPath
- mac_alias.kBookmarkVolumePath
- mac_alias.kBookmarkVolumeURL
- mac_alias.kBookmarkVolumeName
- mac_alias.kBookmarkVolumeUUID
- mac_alias.kBookmarkVolumeSize
- mac_alias.kBookmarkVolumeCreationDate
- mac_alias.kBookmarkVolumeProperties
- mac_alias.kBookmarkContainingFolder
- mac_alias.kBookmarkUserName
- mac_alias.kBookmarkUID
- mac_alias.kBookmarkWasFileReference
- mac_alias.kBookmarkCreationOptions
- mac_alias.kBookmarkURLLengths
- mac_alias.kBookmarkSecurityExtension
Bookmark data keys. A Bookmark holds a set of TOCs (Tables of Contents), each of which maps a set of keys to a set of values. The keys are either numeric, like the ones represented by the above constants, or strings.
Bookmarks can hold strings, byte data, numbers, dates, booleans, arrays, dicts, UUIDs, URLs and NULLs (represented by Python None). If you store data in a bookmark using the string key functionality, the documentation for CF/NSURL recommends using reverse DNS for the keys to avoid clashes.
Binary Formats
Mac Alias Format
Everything below is big-endian.
An Alias record starts as follows:
Offset | Size | Contents |
---|---|---|
0 | 4 | Application specific four-character code |
4 | 2 | Record size (must be >= 150 bytes) |
6 | 2 | Version (we support version 2) |
8 | 2 | Alias kind (0 = file, 1 = folder) |
10 | 28 | Volume name (Pascal-style string; first octet gives length) |
38 | 4 | Volume date (seconds since 1904-01-01 00:00:00 UTC) |
42 | 2 | Filesystem type (typically ‘H+’ for HFS+) |
44 | 2 | Disk type (0 = fixed, 1 = network, 2 = 400 KB floppy, 3 = 800 KB floppy, 4 = 1.44 MB floppy, 5 = ejectable) |
46 | 4 | CNID of containing folder |
50 | 64 | Target name (Pascal-style string) |
114 | 4 | Target CNID |
118 | 4 | Target creation date (seconds since 1904-01-01 00:00:00 UTC) |
122 | 4 | Target creator code (four-character code) |
126 | 4 | Target type code (four-character code) |
130 | 2 | Number of directory levels from alias to root (or -1) |
132 | 2 | Number of directory levels from root to target (or -1) |
134 | 4 | Volume attributes |
138 | 2 | Volume filesystem ID |
140 | 10 | Reserved (set to zero) |
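As an illustration of the table above, here is a minimal sketch that unpacks the fixed 150-byte part of a record with Python's `struct` module. It does not use the `mac_alias` API; the names `ALIAS_HEADER`, `mac_date` and `parse_alias_header`, and the dictionary keys, are made up for this example.

```python
import datetime
import struct

# Field layout copied from the table above; '>' selects big-endian with
# no alignment padding, and 'p' is a Pascal-style string.
ALIAS_HEADER = struct.Struct('>4sHHH28pI2sHI64pII4s4shhIH10s')
MAC_EPOCH = datetime.datetime(1904, 1, 1, tzinfo=datetime.timezone.utc)


def mac_date(seconds):
    """Convert seconds since 1904-01-01 00:00:00 UTC to a datetime."""
    return MAC_EPOCH + datetime.timedelta(seconds=seconds)


def parse_alias_header(data):
    """Unpack the fixed part of an Alias record into a plain dict."""
    if len(data) < ALIAS_HEADER.size:
        raise ValueError('need at least %d bytes' % ALIAS_HEADER.size)
    (appinfo, record_size, version, kind, volume_name, volume_date,
     fs_type, disk_type, folder_cnid, filename, cnid, creation_date,
     creator_code, type_code, levels_from, levels_to,
     volume_attrs, volume_fs_id, _reserved) = ALIAS_HEADER.unpack_from(data)
    return {
        'appinfo': appinfo,
        'record_size': record_size,
        'version': version,
        'kind': kind,
        'volume_name': volume_name,
        'volume_date': mac_date(volume_date),
        'fs_type': fs_type,
        'disk_type': disk_type,
        'folder_cnid': folder_cnid,
        'filename': filename,
        'cnid': cnid,
        'creation_date': mac_date(creation_date),
        'creator_code': creator_code,
        'type_code': type_code,
        'levels_from': levels_from,
        'levels_to': levels_to,
        'volume_attrs': volume_attrs,
        'volume_fs_id': volume_fs_id,
    }
```

The names are returned as raw bytes because the table does not state a text encoding for the Pascal-style strings.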
This record is optionally followed by tag-length-value data:
Offset | Size | Contents |
---|---|---|
0 | 2 | Tag |
2 | 2 | Length |
4 | Length | Value |
If the length is odd, a pad byte is added at the end.
Valid tags are:
Tag | Contents |
---|---|
-1 | Signifies the end of the alias record |
0 | Carbon folder name (a string) |
1 | CNID path (an array of CNIDs, one per directory) |
2 | Carbon path (a string) |
3 | AppleShare zone (a string) |
4 | AppleShare server name (a string) |
5 | AppleShare username (a string) |
6 | Driver name (a string) |
9 | Network mount information |
10 | Dial-up connection information |
14 | Unicode filename of target (a UTF-16 big endian string) |
15 | Unicode volume name (a UTF-16 big endian string) |
16 | High resolution volume creation date (65536ths of a second since 1904-01-01 00:00:00 UTC) |
17 | High resolution creation date (65536ths of a second since 1904-01-01 00:00:00 UTC) |
18 | POSIX path (a string) |
19 | POSIX path to volume mountpoint (a string) |
20 | Recursive alias of disk image (an alias record) |
21 | User home prefix length (two-byte integer, says how many directory levels to the user’s home folder) |
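The tag-length-value area can be walked with a short loop. The sketch below assumes the area starts right after the 150-byte fixed part and treats every value as raw bytes; `iter_alias_tags` and `pack_alias_tag` are hypothetical helper names, not part of `mac_alias`.

```python
import struct


def iter_alias_tags(data, offset=150):
    """Yield (tag, value) pairs until the end tag (-1) or end of data."""
    while offset + 4 <= len(data):
        # The tag is read as signed so that the end marker appears as -1.
        tag, length = struct.unpack_from('>hH', data, offset)
        if tag == -1:
            break
        offset += 4
        yield tag, data[offset:offset + length]
        # A value of odd length is followed by a single pad byte.
        offset += length + (length & 1)


def pack_alias_tag(tag, value):
    """Encode one tag-length-value item, adding the pad byte if needed."""
    padding = b'\0' if len(value) & 1 else b''
    return struct.pack('>hH', tag, len(value)) + value + padding
```

For example, `pack_alias_tag(18, b'/tmp/file.ext')` produces 18 bytes: a 4-byte tag and length, the 13-byte POSIX path, and one pad byte.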
Mac Bookmark Format
Everything below is little-endian unless otherwise mentioned.
The Bookmark format is a more modern alternative to the alias record. Bookmarks consist of a set of dictionaries mapping keys to values; each dictionary has its own Table of Contents (TOC) structure.
The record starts with a header:
Offset | Size | Contents |
---|---|---|
0 | 4 | Magic number (‘book’) |
4 | 4 | Total size in bytes |
8 | 4 | Unknown (0x10040000) - might be a version? |
12 | 4 | Size of header (48) |
16 | 32 | Reserved |
All offsets stored in the file are relative to the end of this header.
This is immediately followed at location 48 by a 4-byte offset to the first TOC structure. It seems odd that this is not part of the header, but for some reason best known to the engineers at Apple, it isn’t.
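A minimal sketch of reading the header and locating the first TOC, based only on the layout described above. `BOOKMARK_HEADER` and `first_toc_offset` are names invented for this example.

```python
import struct

# Magic, total size, unknown word, header size, 32 reserved bytes.
BOOKMARK_HEADER = struct.Struct('<4sIII32s')


def first_toc_offset(data):
    """Return the absolute offset of the first TOC in a Bookmark record."""
    magic, total_size, _unknown, header_size, _reserved = \
        BOOKMARK_HEADER.unpack_from(data)
    if magic != b'book':
        raise ValueError('not a Bookmark record')
    # The 4-byte TOC offset sits immediately after the header and, like
    # every other stored offset, is relative to the end of the header.
    (relative,) = struct.unpack_from('<I', data, header_size)
    return header_size + relative
```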
A TOC starts with its own header:
Offset | Size | Contents |
---|---|---|
0 | 4 | Size of TOC in bytes, minus 8 |
4 | 4 | Magic number (0xfffffffe) |
8 | 4 | Identifier (just a number) |
12 | 4 | Next TOC offset (or 0 if none) |
16 | 4 | Number of entries in this TOC |
This is followed by an array of TOC entries. There is code that does a binary search of the TOC structure, so the entries must be stored in key order. A TOC entry looks like this:
Offset | Size | Contents |
---|---|---|
0 | 4 | Key |
4 | 4 | Offset to data record |
8 | 4 | Reserved (0) |
If the key has its top bit set (0x80000000), then (key & 0x7fffffff) gives the offset of a string record.
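The TOC header and its entries could be read along the following lines. This is a sketch under the layout given above, not the module's own code; `read_toc` and the `'numeric'`/`'string'` labels are made up for illustration, and stored offsets are converted to absolute positions by adding the header size.

```python
import struct

TOC_MAGIC = 0xfffffffe


def read_toc(data, toc_offset, header_size=48):
    """Return (identifier, next_toc_offset, entries) for one TOC.

    Each entry is (key_kind, key_value, record_offset). For string keys,
    key_value is the absolute offset of the string record.
    """
    _size, magic, identifier, next_toc, count = struct.unpack_from(
        '<IIIII', data, toc_offset)
    if magic != TOC_MAGIC:
        raise ValueError('bad TOC magic number')
    entries = []
    position = toc_offset + 20
    for _ in range(count):
        key, record, _reserved = struct.unpack_from('<III', data, position)
        position += 12
        if key & 0x80000000:
            # Top bit set: the remaining bits locate a string record.
            entry = ('string', header_size + (key & 0x7fffffff))
        else:
            entry = ('numeric', key)
        entries.append(entry + (header_size + record,))
    # A next-TOC offset of 0 means there are no further TOCs.
    next_offset = header_size + next_toc if next_toc else None
    return identifier, next_offset, entries
```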
Each data record has the following fields:
Offset | Size | Contents |
---|---|---|
0 | 4 | Length of data (n) |
4 | 4 | Type |
8 | n | Data bytes |
Known data types are as follows:
Code | Type | Encoding |
---|---|---|
0x0101 | String | UTF-8 |
0x0201 | Data | Raw bytes |
0x0301 | Number (signed 8-bit) | 1-byte number |
0x0302 | Number (signed 16-bit) | 2-byte number |
0x0303 | Number (signed 32-bit) | 4-byte number |
0x0304 | Number (signed 64-bit) | 8-byte number |
0x0305 | Number (32-bit float) | IEEE single precision |
0x0306 | Number (64-bit float) | IEEE double precision |
0x0400 | Date | Big-endian IEEE double precision seconds since 2001-01-01 00:00:00 UTC |
0x0500 | Boolean (false) | No data |
0x0501 | Boolean (true) | No data |
0x0601 | Array | Array of 4-byte offsets to data items |
0x0701 | Dictionary | Array of pairs of 4-byte (key, value) data item offsets |
0x0801 | UUID | Raw bytes |
0x0901 | URL | UTF-8 string |
0x0902 | URL (relative) | 4-byte offset to base URL, 4-byte offset to UTF-8 string |
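To make the table concrete, the sketch below decodes a single data record for most of the listed type codes. It is an illustration only: `read_record` is a hypothetical helper, dictionaries and relative URLs are left out, and array items are returned as the stored offsets (relative to the end of the header) rather than being resolved.

```python
import datetime
import struct
import uuid

BOOKMARK_EPOCH = datetime.datetime(2001, 1, 1, tzinfo=datetime.timezone.utc)
NUMBER_FORMATS = {
    0x0301: '<b', 0x0302: '<h', 0x0303: '<i', 0x0304: '<q',
    0x0305: '<f', 0x0306: '<d',
}


def read_record(data, offset):
    """Decode the data record found at the given absolute offset."""
    length, type_code = struct.unpack_from('<II', data, offset)
    payload = data[offset + 8:offset + 8 + length]
    if type_code in (0x0101, 0x0901):        # string, URL
        return payload.decode('utf-8')
    if type_code == 0x0201:                  # raw data
        return payload
    if type_code in NUMBER_FORMATS:
        return struct.unpack(NUMBER_FORMATS[type_code], payload)[0]
    if type_code == 0x0400:                  # date, stored big-endian
        (seconds,) = struct.unpack('>d', payload)
        return BOOKMARK_EPOCH + datetime.timedelta(seconds=seconds)
    if type_code in (0x0500, 0x0501):        # booleans carry no data
        return type_code == 0x0501
    if type_code == 0x0601:                  # array of item offsets
        return list(struct.unpack('<%dI' % (length // 4), payload))
    if type_code == 0x0801:
        return uuid.UUID(bytes=payload)
    raise NotImplementedError('type 0x%04x' % type_code)
```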
The first TOC in the file generally has its identifier set to 1. As mentioned, the keys in each TOC can be strings, in which case the key field will contain the offset to the string, or they can be certain special values. Currently known values are:
Key | Meaning | Value |
---|---|---|
0x1003 | Unknown | Unknown |
0x1004 | Target path | Array of individual path components |
0x1005 | Target CNID path | Array of CNIDs |
0x1010 | Target flags | Data - see below |
0x1020 | Target filename | String |
0x1030 | Target CNID | 4-byte integer |
0x1040 | Target creation date | Date |
0x1054 | Unknown | Unknown |
0x1055 | Unknown | Unknown |
0x1056 | Unknown | Unknown |
0x1101 | Unknown | Unknown |
0x1102 | Unknown | Unknown |
0x2000 | TOC path | Array - see below |
0x2002 | Volume path | Array of individual path components |
0x2005 | Volume URL | URL of volume root |
0x2010 | Volume name | String |
0x2011 | Volume UUID | String (not a UUID!) |
0x2012 | Volume size | 8-byte integer |
0x2013 | Volume creation date | Date |
0x2020 | Volume flags | Data - see below |
0x2030 | Volume is root | True if the volume was the filesystem root |
0x2040 | Volume bookmark | TOC identifier for disk image |
0x2050 | Volume mount point | URL |
0x2070 | Unknown | Unknown |
0xc001 | Containing folder index | Integer index of containing folder in target path array |
0xc011 | Creator username | Name of user that created bookmark |
0xc012 | Creator UID | UID of user that created bookmark |
0xd001 | File reference flag | True if creating URL was a file reference URL |
0xd010 | Creation options | Integer containing flags passed to CFURLCreateBookmarkData |
0xe003 | URL length array | Array of integers - see below |
0xf017 | Localized name? | String? |
0xf022 | Unknown | Unknown |
0xf080 | Security extension | Unknown but looks like a hash with data and an access right |
0xf081 | Unknown | Unknown |
The target flags (0x1010) are encoded as a Data object containing three 8-byte integers. The first contains flags describing the target; the second says which flags are valid, and the third appears to always be zero. Supported flags can be found in CFURLPriv.h, which is part of CF-Lite; for the target flags field, it’s the “resource property flags” that are valid.
The volume flags (0x2020) are encoded in the same manner, but here it is the “volume property flags” that apply.
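As a rough sketch of how the three integers might be interpreted, the helper below reports a flag as set, clear, or not valid. It assumes the integers are little-endian like the rest of the format; `flag_state` is a hypothetical name, and the masks used in the example are arbitrary bit values, not constants from CFURLPriv.h.

```python
import struct


def flag_state(payload, mask):
    """Return True/False for a valid flag, or None if it is not valid."""
    flags, valid, _third = struct.unpack('<QQQ', payload)
    if not valid & mask:
        return None
    return bool(flags & mask)


# Arbitrary example: bits 0 and 2 set, bits 0 to 2 marked valid.
example = struct.pack('<QQQ', 0x5, 0x7, 0)
```

With this payload, `flag_state(example, 0x1)` is `True`, `flag_state(example, 0x2)` is `False`, and `flag_state(example, 0x8)` is `None` because that bit is not marked valid.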
The TOC path (0x2000) is only used if there are multiple volumes between the target and the filesystem root. In that case, it contains an array, with every other item holding a TOC ID for a dictionary describing a volume; the values between TOC IDs appear to be zero. The array starts from the filesystem root.
The URL length array (0xe003) is used to indicate how the path components were originally broken up; if the URL encoded by the bookmark has a base URL, each entry in the length array gives the number of path elements that come from that base URL.