This document is written for consumption by anyone who has written a BBEdit language module, either codeless or compiled. It documents the changes in the language module API as well as information that is essential for developing language modules that make the most of the improvements in BBEdit 11.0 and later.
This document supplements the information provided in the Codeless Language Module Reference as well as the information in the "Writing Language Modules" document included as part of the BBEdit SDK.
Use this one weird trick to debug your compiled language module in BBEdit with minimal effort:
In your language module project's "Run" scheme (in Xcode), go to the "Info" tab, and from the Executable popup, choose "Ask on Launch".
Then, go to the Arguments tab, and add an argument as follows:
--debugLanguageModule $(BUILT_PRODUCTS_DIR)/$(WRAPPER_NAME)
Now choose "Run". Xcode will ask you to choose the application
to run. Choose BBEdit. (Note that it must not be running.) Xcode
will then launch BBEdit with the --debugLanguageModule argument
as provided above, which tells it to load your language module
from its build location. You can then debug it in place.
Language modules support two new property list keys:
BBLMSpellableRunKinds and BBLMNonSpellableRunKinds. Between
them these keys can eliminate the need for modules to implement
the kBBLMCanSpellCheckRunMessage message, and the appropriate
use of these keys adds considerable flexibility.
Each of these keys is an array, listing the run kinds that can
(or cannot) be inspected by the spell checker. The test is
exclusionary: BBLMNonSpellableRunKinds is checked first, and
if a run is found there, it is not spell checked. If a run is not
found in BBLMNonSpellableRunKinds, then BBLMSpellableRunKinds
is checked. Either or both of these arrays may contain wildcards.
A common case is to have BBLMNonSpellableRunKinds contain an
appropriate list of runs, and BBLMSpellableRunKinds contain a
single entry: "*". Thus, specific run kinds listed in
BBLMNonSpellableRunKinds are not spell checked, and all other
run kinds are. This is a useful and typical construction for
text-oriented language modules, such as TeX and Markdown,
thus:
<key>BBLMNonSpellableRunKinds</key>
<array>
<string>com.barebones.bblm.TeX.verbatim</string>
<string>com.barebones.bblm.TeX.inline-verbatim</string>
<string>com.barebones.bblm.TeX.command</string>
<string>com.barebones.bblm.TeX.math-string</string>
<string>com.barebones.bblm.TeX.delimiter-start</string>
<string>com.barebones.bblm.TeX.delimiter-stop</string>
<string>com.barebones.bblm.TeX.param-command</string>
<string>com.barebones.bblm.TeX.param-math-string</string>
<string>com.barebones.bblm.TeX.param-string-command</string>
</array>
<key>BBLMSpellableRunKinds</key>
<array>
<string>*</string>
</array>
Typically, a programming or scripting language will want to allow spell checking in comments, but not elsewhere. For example, in the Python module:
<key>BBLMSpellableRunKinds</key>
<array>
<string>com.barebones.bblm.line-comment</string>
<string>com.barebones.bblm.block-comment</string>
</array>
<key>BBLMNonSpellableRunKinds</key>
<array>
<string>com.barebones.bblm.code</string>
<string>com.barebones.bblm.double-string</string>
</array>
Either of these arrays may be absent or empty. Note that if a match
is not found (and this includes the case in which
BBLMSpellableRunKinds and/or BBLMNonSpellableRunKinds is absent
or empty), then BBEdit will still call the language module. If the
module does not implement kBBLMCanSpellCheckRunMessage, then the
run is not checked.
The language module support for BBLMSpellableRunKinds and
BBLMNonSpellableRunKinds has since changed: if at least one of
these keys is present, the application will not call the language
module with kBBLMCanSpellCheckRunMessage, so the keys should be
as complete as needed. If either key is absent or fails to match
the run kind, the behavior is unspecified (but the application
will always try to behave predictably).
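The checking order described above can be sketched in plain C++. This is only an illustration of the documented order (non-spellable list first, then spellable list), not BBEdit's implementation. The names `SpellVerdict`, `listMatches`, and `classifyRun` are hypothetical, and the sketch assumes that the only wildcard form is a lone `*` entry that matches every run kind.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Outcome of consulting the two plist arrays for one run kind.
enum class SpellVerdict { NotSpellable, Spellable, NoMatch };

// Hypothetical matcher: an entry matches if it equals the run kind,
// or if it is the lone wildcard "*" (an assumption about wildcard syntax).
static bool listMatches(const std::vector<std::string>& list,
                        const std::string& runKind) {
    for (const std::string& entry : list) {
        if (entry == "*" || entry == runKind) return true;
    }
    return false;
}

// Exclusionary test: the non-spellable list is consulted first.
SpellVerdict classifyRun(const std::vector<std::string>& nonSpellable,
                         const std::vector<std::string>& spellable,
                         const std::string& runKind) {
    if (listMatches(nonSpellable, runKind)) return SpellVerdict::NotSpellable;
    if (listMatches(spellable, runKind)) return SpellVerdict::Spellable;
    return SpellVerdict::NoMatch;  // what happens next is up to the application
}
```

With the TeX configuration shown earlier, a run of kind `com.barebones.bblm.TeX.command` would come back as `NotSpellable`, and any run kind not in the exclusion list would fall through to the `*` entry and come back as `Spellable`.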
Language modules support two new property list keys:
BBLMCompletableRunKinds and BBLMNonCompletableRunKinds. Between
them these keys eliminate the need for modules to implement
the kBBLMFilterRunForTextCompletion message, and the appropriate
use of these keys adds considerable flexibility.
Each of these keys is an array, listing the run kinds that can (or
cannot) be tokenized for autocompletion. The test is exclusionary:
BBLMNonCompletableRunKinds is checked first, and if a run is
found there, it is not tokenized. If a run is not found in
BBLMNonCompletableRunKinds, then BBLMCompletableRunKinds is
checked. Either or both of these arrays may contain wildcards.
Either of these arrays may be absent or empty. Note that if at
least one of these is present, the application will not call the
language module with kBBLMFilterRunForTextCompletion, so the
keys should be as complete as needed. If either key is absent or
fails to match the run kind, the behavior is unspecified (but the
application will always try to behave predictably).
The old kBBLMMatchKeywordMessage message is no longer sent to
compiled language modules; only
kBBLMMatchKeywordWithCFStringMessage is used, with a
CFStringRef parameter.
Language modules can now specify arbitrary sets of
keywords, each grouped by the run kind that should be used to
color them. The BBLMKeywords key is an array of dictionaries.
In each dictionary, there is a RunKind key that specifies the
run kind to be used (one of the factory-supplied run kinds, or
one defined in your language module's BBLMRunColors array), and
either a Keywords key whose value is an array of keywords to be
colored using that run kind, or a KeywordFileName key which
refers to a file in the language module's bundle (for compiled
modules).
So, for example, the BBLMKeywords list looks like this for
the built-in PHP language module:
<key>BBLMKeywords</key>
<array>
<dict>
<key>RunKind</key> <string>com.barebones.bblm.keyword</string>
<key>KeywordFileName</key> <string>PHP Keywords.txt</string>
</dict>
<dict>
<key>RunKind</key> <string>com.barebones.bblm.predefined-symbol</string>
<key>KeywordFileName</key> <string>PHP Predefined Names.txt</string>
</dict>
</array>
Alternatively, you could write something like this:
<key>BBLMKeywords</key>
<array>
<dict>
<key>RunKind</key> <string>com.barebones.bblm.keyword</string>
<key>Keywords</key>
<array>
<string>abstract</string>
<string>and</string>
<string>array</string>
<string>as</string>
<string>break</string>
<string>case</string>
<string>catch</string>
<string>cfunction</string>
<string>class</string>
<string>clone</string>
<!-- and so on ... -->
</array>
</dict>
<dict>
<key>RunKind</key> <string>com.barebones.bblm.predefined-symbol</string>
<key>KeywordFileName</key> <string>PHP Predefined Names.txt</string>
</dict>
</array>
The run kinds you can use are not limited to the built-in ones; you
can define your own run kinds and color mappings using a
BBLMRunColors key, as previously described. You must also add a
BBLMRunNames key which maps those run kinds to human-readable
names, so that users can adjust the color settings.
Note that BBLMKeywords supersedes the four old keys, which are
still supported but should no longer be used:
BBLMKeywordList
BBLMKeywordFileName
BBLMPredefinedNameList
BBLMPredefinedNameFileName
kBBLMMatchKeywordWithCFStringMessage and
kBBLMMatchPredefinedNameMessage are no longer sent to language
modules, and BBLMSupportsCFStringKeywordLookups and
BBLMSupportsPredefinedNameLookups are no longer used in module
plists. Instead, there's a new key, BBLMSupportsWordLookup,
which triggers the sending of a new message:
kBBLMRunKindForWordMessage. This allows arbitrary mapping at
runtime of words to run kinds, which in turn provides additional
flexibility for coloring.
Static listing of keyword-to-run-kind mapping in the module plist is
still desirable (because it's faster), but for situations where the
test must be done at runtime based on certain string
transformations, implementing kBBLMRunKindForWordMessage support
is an appropriate solution.
The parameters to this message are (input) the potential keyword,
and (output) the run kind that should be used to color the word. (If
the word is not known, return nil.)
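As a rough illustration of a runtime lookup that depends on a string transformation, here is a minimal sketch in plain C++ of a case-insensitive word-to-run-kind mapping. It does not use the real parameter block: `runKindForWord` and its keyword table are hypothetical, `std::string` stands in for the string type the API actually passes, and an empty string stands in for nil.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the work a kBBLMRunKindForWordMessage handler
// would do: map a candidate word to a run kind, or return an empty string
// (standing in for nil) when the word is not known.
std::string runKindForWord(const std::string& word) {
    // Illustrative table only; a real module would supply its own words.
    static const std::unordered_map<std::string, std::string> kTable = {
        {"begin", "com.barebones.bblm.keyword"},
        {"end",   "com.barebones.bblm.keyword"},
        {"true",  "com.barebones.bblm.predefined-symbol"},
    };

    // The "string transformation": fold to lower case before the lookup,
    // so that BEGIN, Begin, and begin all map to the same run kind.
    std::string folded = word;
    std::transform(folded.begin(), folded.end(), folded.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    auto it = kTable.find(folded);
    return it == kTable.end() ? std::string() : it->second;
}
```

A language whose keywords are exact-case and fixed would not need this; the static BBLMKeywords listing covers that case and is faster.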
Language modules may now use an (optional) key:
BBLMKeywordPatterns. This key contains an array of
dictionaries, each with two key/value pairs. The first key,
RunKind, contains the name of the run (in the module's
namespace, or one of the factory-defined run kinds). The second key,
Pattern, contains a Grep pattern which is used to match the
keyword. For example:
<key>BBLMKeywordPatterns</key>
<array>
<dict>
<key>RunKind</key>
<string>com.example.bblm.fo</string>
<key>Pattern</key>
<string>fo.*</string>
</dict>
<dict>
<key>RunKind</key>
<string>com.example.bblm.fa</string>
<key>Pattern</key>
<string>fa.*</string>
</dict>
<dict>
<key>RunKind</key>
<string>com.example.bblm.fl</string>
<key>Pattern</key>
<string>fl.*</string>
</dict>
</array>
If the module has no static BBLMKeywords entry, or if the word
being examined fails to match an entry in the BBLMKeywords entry,
then BBEdit will attempt to match the keyword against one of the
patterns. If a match is found, the appropriate run kind is generated
for coloring.
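The fallback order just described (static keywords first, then patterns) can be sketched as follows. This is an illustration only, under several assumptions: `KeywordTables` and `resolveRunKind` are hypothetical names, `std::regex` (ECMAScript syntax) stands in for BBEdit's Grep engine, a pattern is assumed to need to match the whole word, and patterns are assumed to be tried in array order with the first match winning.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical in-memory form of the two plist entries.
struct KeywordTables {
    // word -> run kind, as listed under BBLMKeywords
    std::unordered_map<std::string, std::string> staticKeywords;
    // pattern -> run kind, as listed under BBLMKeywordPatterns
    std::vector<std::pair<std::regex, std::string>> patterns;
};

// Returns the run kind for a word, or an empty string if nothing matches.
std::string resolveRunKind(const KeywordTables& tables, const std::string& word) {
    // 1. A static keyword entry takes precedence.
    auto it = tables.staticKeywords.find(word);
    if (it != tables.staticKeywords.end()) return it->second;

    // 2. Otherwise try each pattern in order (whole-word match assumed).
    for (const auto& entry : tables.patterns) {
        if (std::regex_match(word, entry.first)) return entry.second;
    }
    return std::string();
}
```

With the example patterns above, a word such as `foo` would resolve to `com.example.bblm.fo` unless it also appeared in the static keyword list, in which case the static run kind would be used.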
Codeless language modules now support a Number Pattern key in
the Language Features property. The Number Pattern key
may be omitted; if so, BBEdit will apply a default pattern which
matches integers, floating point numbers, and hexadecimal numbers
prefixed with 0x.
Here is the default pattern, in a representation suitable for
inclusion in codeless language modules. For readability it's
formatted as an SGML CDATA section and uses the (?x:...) pattern
modifier for extended syntax, which allows comments and whitespace.
<![CDATA[(?x: (?# this just turns on extended syntax, which allows whitespace and comments)
(?<![\d\w.]) (?# must not be preceded by a digit or word char or period)
(?: (?# non-capturing group for alternation)
(?# version 1: hex notation like 0x0123456789abcdef)
(?:0x[[:xdigit:]]+) (?# the number written in hex form)
|
(?: (?# version 2: all other numbers, including whole numbers, decimals and exponentials)
[-+]? (?# optional plus or minus sign is included as part of the number)
(?: (?# non-capturing group for alternation)
\d+\.\d+ (?# version 2a: digits followed by a decimal followed by digits)
|
\d+ (?# version 2b: just digits)
)
(?: (?# optional exponent notation)
[eE][-+]? (?# with optional pos/neg)
\d+ (?# numeric portion of the exponent)
)?
)
)
(?=\b) (?# required word boundary after number. Here a decimal is fine.)
)]]>
Codeless language modules support a new key in the
Language Features dictionary: Keyword Pattern. This can be
used to specify runs of text that are to be colored using the
Keywords color, based on a Grep pattern. The intention is to
support languages with multi-word "keywords" which contain
word-break characters or white space; so the pattern you use
should be written accordingly. A pattern that matches across a
line boundary will probably produce unexpected results, so we
recommend using the non-greedy quantifiers when possible, or
character classes which don't include line breaks.
If a language module supplies a BBLMRunNameUIOrdering
key, the run kinds in that array are used in the specified order
to map names for the preferences UI. If no
BBLMRunNameUIOrdering key is supplied, the keys in the
BBLMRunNames array are sorted alphabetically for presentation
in the UI.
Beginning with BBEdit 12.5, plug-in language modules may specify custom badge information for use in the function menu.
Here's how it works:
The BBLMFunctionKinds enumeration in BBLMInterface.h provides a
pre-defined list of function kinds, and the range between
kBBLMFirstUserFunctionKind and kBBLMLastUserFunctionKind is
available for use by language modules.
When you call bblmAddFunctionToList() or bblmUpdateFunctionEntry(),
set the fKind field of the function information to a value from the
range of built-in function kinds (kBBLMFunctionMark through
kBBLMLastUsedFunctionKind - 1), or use a value in the range of
kBBLMFirstUserFunctionKind through kBBLMLastUserFunctionKind.
Note that the range of user function kinds corresponds roughly to the printable ASCII range. This is intentional, because the next thing you'll do is add a section to your language module's language property list. Here is an example for Java:
<key>BBLMFunctionItemKinds</key>
<dict>
<key>P</key>
<dict>
<key>typeString</key>
<string>com.barebones.bblm.Java.package-decl</string>
<key>displayName</key>
<string>package declaration</string>
<key>labelBadgeShape</key>
<string>circle</string>
<key>labelCharacter</key>
<string>p</string>
<key>labelColorName</key>
<string>CodeSenseLightRed</string>
</dict>
</dict>
Each top-level key in the BBLMFunctionItemKinds dictionary
corresponds to the character value that you used for the function
information's fKind field. Thus, making it a printable ASCII
character is useful for various reasons. The key is required to be a
single character.
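To make the relationship between the fKind value and the plist key concrete, here is a small sketch in plain C++. The names `kPackageDeclKind` and `plistKeyForFunctionKind` are hypothetical, and the range check uses visible ASCII (0x21 through 0x7E) as an assumption; it is not the actual value of kBBLMFirstUserFunctionKind or kBBLMLastUserFunctionKind, which should be taken from BBLMInterface.h.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical user function kind: the character 'P', matching the
// top-level "P" key in the Java example above. Assumes 'P' lies within
// the user range defined in BBLMInterface.h.
constexpr std::uint32_t kPackageDeclKind = 'P';

// Builds the single-character plist key for a function kind.
// Returns an empty string for values outside visible ASCII (an
// assumption made for this sketch, not the header's real bounds).
std::string plistKeyForFunctionKind(std::uint32_t kind) {
    if (kind < 0x21 || kind > 0x7E) return std::string();
    return std::string(1, static_cast<char>(kind));
}
```

Choosing the kind as a character literal keeps the code and the property list visibly in step: the value assigned to fKind and the dictionary key are the same character.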
For each function kind, the values are as follows:
typeString: (required) a reverse-domain description of the function type.
The form is similar to that used for custom run kinds
that you generate: it should begin with your plug-in's bundle
identifier.
displayName: (required) a brief human-readable description of the function
type.
labelBadgeShape: (optional) describes the shape of the badge that
appears in the function menu. Allowed values are default,
square, circle, triangle, and roundRect. If this key is
absent, default is used.
labelCharacter: (optional) tells BBEdit what character to use in
the badge. If this is absent, BBEdit will use the character value of
the item kind's key (in the example above, this would be P).
labelColorName: (optional) tells BBEdit what background color to use
for the badge. You may use any CSS3 color name; the following built-in
colors are also provided:
`CodeSenseLightBlue`
`CodeSenseLightRed`
`CodeSenseLightGreen`
`CodeSenseLightPurple`
`MarkerBadgeColor`
`CodeSenseOrange`
`BBEditDarkPurple`
If this key is absent, BBEdit will use CodeSenseLightBlue.
Note: Use the built-in function kinds whenever possible. For
example, if your language has the notion of an object class, use
kBBLMFunctionClassDeclaration, kBBLMFunctionClassInterface, or
kBBLMFunctionClassImplementation as appropriate, rather than
creating your own badge.
Also, do not attempt to override the built-in mappings.
Beginning with BBEdit 13.0, compiled language modules now have the ability to generate and use their own document-specific data. (Unless you're writing a compiled language module, you can skip this note.)
This can be for any suitable purpose; for example, if a hypothetical
C-family language module wanted to generate an abstract syntax tree
for the document using clang, it could do so.
BBEdit does not inspect or use any data created by the language
module, nor does it make any assumptions about what's in it. The
only rule is that it will be treated as an NSObject and passed
through the API boundary as such, but the language module can
instantiate it as any NSObject subclass (including one defined
by the module itself) and assume that it will be of that type.
The main BBLMParamBlock structure gains the following top-level
fields:
fDocumentParseData: the module-generated data object for this document.
fOutDocumentParseDataIsNew: if the module creates a new data object for this document, it should set fDocumentParseData to the new object value, and set fOutDocumentParseDataIsNew to true.
fDocumentIdentifier: a unique identifier for the document. The language module can use this to keep track of data for different documents, for the lifetime of the application.
fDocumentLocation: if not nil, provides the location of the document's backing file on disk. Note: you cannot assume that the document data on disk is consistent with what's in memory. You should always rely (and continue to rely) on the data provided by fText/fTextLength as authoritative.
There are four new messages relating to the management and lifetime of parse data:
kBBLMInitParseDataMessage: When this is called, the language module may allocate any data specific to this document. Note that doing so is not required; you could certainly wait until you receive a kBBLMRecalculateParseDataMessage to do so.
kBBLMDisposeParseDataMessage: When this is called, the language module should deallocate any data contained in fDocumentParseData, in the case that it is not intrinsically reference-counted. (Read below for more on this.)
kBBLMRecalculateParseDataMessage: When this is called, the language module may calculate from scratch and return any appropriate parse data for the document. fDocumentParseData will be the result of a previous kBBLMInitParseDataMessage. If you opted not to do anything previously, then fDocumentParseData will be nil on entry; you should create it as needed, return it in fDocumentParseData, and set fOutDocumentParseDataIsNew to true.
kBBLMUpdateParseDataMessage: When this is called, the parameter block's fUpdateParseDataParams member contains information about the location and nature of the change. You can use this information to incrementally recalculate your parse data; or you can recalculate it all from scratch as though you had received a kBBLMRecalculateParseDataMessage. If you decide to recalculate from scratch and create a new parse data object, put it in fDocumentParseData and set fOutDocumentParseDataIsNew to true.
Important Notes About Object Lifetimes
Under no circumstances should you attempt to assume ownership of
the NSObject subclass that you return in fDocumentParseData,
even if you are changing its value and setting
fOutDocumentParseDataIsNew. If you return a new parse data object,
BBEdit will release the old one for you.
Considerations for non-refcounted data
In some cases, your parse data might be a C++ class instance, or
even an allocated C structure. In order to pass it back and forth
across the API boundary, you must wrap it in an NSValue as a
pointer value. In that case, you must also take some care to
manage the object lifetime yourself, since BBEdit can't otherwise
know what needs to be done with it. Thus, given some hypothetical
ParseTree C++ class, you would write something like:
ParseTree *myParseTree = new ParseTree;
/* ...do some parsing... */
params.fDocumentParseData = [NSValue valueWithPointer: myParseTree];
params.fOutDocumentParseDataIsNew = true;
You would use this pattern in response to kBBLMInitParseDataMessage,
but also if you calculated a new parse tree in response to
kBBLMRecalculateParseDataMessage or kBBLMUpdateParseDataMessage.
One additional wrinkle, though: when recalculating or updating,
if you make a new C++ object, you need to dispose of the old one,
but not release the NSValue instance itself. This is because
BBEdit doesn't know what's wrapped up in the NSValue, or how it
should be managed.
So in the case where you're changing the object during update or recalculate, you'd have code like this:
ParseTree *oldParseTree = NULL;
ParseTree *newParseTree = NULL;
oldParseTree = static_cast<ParseTree*>(params.fDocumentParseData.pointerValue);
delete oldParseTree; // clean up the old data
newParseTree = new ParseTree;
/* ...do some parsing... */
params.fDocumentParseData = [NSValue valueWithPointer: newParseTree];
params.fOutDocumentParseDataIsNew = true;
When receiving a kBBLMDisposeParseDataMessage, you'll have to do the same:
ParseTree *oldParseTree = NULL;
oldParseTree = static_cast<ParseTree*>(params.fDocumentParseData.pointerValue);
delete oldParseTree; // clean up the old data
Note that you do not ever release params.fDocumentParseData!
BBEdit will manage it for you once you've created it. (If you do
release it, you'll rapidly find out what a bad idea that was.)
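The ownership rules above can be exercised outside BBEdit with a small mock. The sketch below is plain C++, not Objective-C++, and everything in it is a stand-in: `MockParams` models only the two fields discussed here, a raw `void*` plays the role of the pointer wrapped in the NSValue, and `recalculate` and `dispose` are hypothetical helper names. A live-object counter makes it easy to confirm that each old tree is deleted exactly once.

```cpp
#include <cassert>

// Hypothetical parse data class with a counter of live instances.
struct ParseTree {
    static int liveCount;
    ParseTree()  { ++liveCount; }
    ~ParseTree() { --liveCount; }
};
int ParseTree::liveCount = 0;

// Mock of the relevant part of the parameter block (not the real struct).
struct MockParams {
    void* parseData = nullptr;     // stands in for the pointer inside the NSValue
    bool  parseDataIsNew = false;  // stands in for fOutDocumentParseDataIsNew
};

// Recalculate/update path: delete the old tree (if any), then publish
// a freshly built one and flag it as new.
void recalculate(MockParams& params) {
    delete static_cast<ParseTree*>(params.parseData);  // deleting nullptr is a no-op
    params.parseData = new ParseTree;
    params.parseDataIsNew = true;
}

// Dispose path: delete the tree. In the real API the NSValue wrapper is
// left alone for BBEdit to release; clearing the pointer here only keeps
// the mock from holding a dangling value.
void dispose(MockParams& params) {
    delete static_cast<ParseTree*>(params.parseData);
    params.parseData = nullptr;
}
```

Calling `recalculate` twice and then `dispose` should leave `ParseTree::liveCount` at zero; a count above zero would indicate a leaked tree, and a double delete would point to releasing data the module no longer owns.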