When self-hosting Sourcebot, you must provide it with a config file. To do so, place the config file in a volume that is mounted into the Sourcebot container, and set the CONFIG_PATH environment variable to the file's path inside the container. For example:
Passing in a CONFIG_PATH to Sourcebot
# "..." stands for any other options you need.
docker run \
    -v "$(pwd)/config.json:/data/config.json" \
    -e CONFIG_PATH=/data/config.json \
    ... \
    ghcr.io/sourcebot-dev/sourcebot:latest
The config file tells Sourcebot which repos to index, what language models to use, and various other settings as defined in the schema.

Config File Schema

The config file you provide to Sourcebot must follow the schema. The schema defines the following top-level properties:
  • Connections (connections): Defines a set of connections that tell Sourcebot which repos to index and from where
  • Language Models (models): Defines a set of language model providers for use with Ask Sourcebot
  • Settings (settings): Additional settings to tweak your Sourcebot deployment
  • Search Contexts (contexts): Groupings of repos that you can search against
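
As a rough illustration of the list above, the following Python sketch flags top-level keys in a config file that are not among the four documented properties, which can catch typos before the file is mounted. The function name `unknown_top_level_keys` is hypothetical, the tolerance for a `$schema` key is an assumption, and the empty objects in the sample are placeholders rather than the shapes the schema requires. The schema itself remains the authority on what is valid.

```python
import json

# Top-level properties named in this section of the docs.
KNOWN_PROPERTIES = {"connections", "models", "settings", "contexts"}
# Assumption: a "$schema" key that references the schema is also tolerated.
TOLERATED_EXTRAS = {"$schema"}


def unknown_top_level_keys(config_text: str) -> list[str]:
    """Return top-level keys that are not among the documented properties."""
    config = json.loads(config_text)
    if not isinstance(config, dict):
        raise ValueError("config root must be a JSON object")
    allowed = KNOWN_PROPERTIES | TOLERATED_EXTRAS
    return sorted(key for key in config if key not in allowed)


if __name__ == "__main__":
    # Placeholder values only; "conections" is a deliberate typo.
    sample = '{"connections": {}, "settings": {}, "conections": {}}'
    print(unknown_top_level_keys(sample))  # prints ['conections']
```

This check only looks at key names. It does not validate the contents of each property.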

Config File Syncing

Sourcebot syncs the config file on startup, and again automatically whenever it detects a change to the file.

Settings

The following settings can be provided in your config file to modify Sourcebot’s behavior:

| Setting | Type | Default | Minimum | Description / Notes |
|---|---|---|---|---|
| maxFileSize | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
| maxTrigramCount | number | 20,000 | 1 | Maximum trigrams per document. Larger files are skipped. |
| reindexIntervalMs | number | 1 hour | 1 | Interval at which all repositories are re-indexed. |
| resyncConnectionIntervalMs | number | 24 hours | 1 | Interval for checking connections that need re-syncing. |
| resyncConnectionPollingIntervalMs | number | 1 second | 1 | DB polling rate for connections that need re-syncing. |
| reindexRepoPollingIntervalMs | number | 1 second | 1 | DB polling rate for repos that should be re-indexed. |
| maxConnectionSyncJobConcurrency | number | 8 | 1 | Concurrent connection-sync jobs. |
| maxRepoIndexingJobConcurrency | number | 8 | 1 | Concurrent repo-indexing jobs. |
| maxRepoGarbageCollectionJobConcurrency | number | 8 | 1 | Concurrent repo-garbage-collection jobs. |
| repoGarbageCollectionGracePeriodMs | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
| repoIndexTimeoutMs | number | 2 hours | 1 | Timeout for a single repo-indexing run. |
| enablePublicAccess (deprecated) | boolean | false | n/a | Use the FORCE_ENABLE_ANONYMOUS_ACCESS environment variable instead. |
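
The defaults above are listed in human-readable units, while the setting names suggest the values are given in milliseconds (the `Ms` suffix) or bytes (`maxFileSize`). The Python sketch below shows one way to compute such values and check them against the documented minimum of 1 before writing them into the `settings` property. The helper `build_settings` and the unit constants are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption; it is not taken from the Sourcebot schema.

```python
import json

# Unit helpers. Assumption: *Ms settings are milliseconds, maxFileSize is bytes.
SECOND_MS = 1_000
MINUTE_MS = 60 * SECOND_MS
HOUR_MS = 60 * MINUTE_MS
MB_BYTES = 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def build_settings(**overrides) -> dict:
    """Check numeric overrides against the documented minimum of 1."""
    for name, value in overrides.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number, got {value!r}")
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    return dict(overrides)


if __name__ == "__main__":
    settings = build_settings(
        maxFileSize=4 * MB_BYTES,            # 4194304 bytes
        reindexIntervalMs=30 * MINUTE_MS,    # 1800000 ms
        repoIndexTimeoutMs=3 * HOUR_MS,      # 10800000 ms
        maxRepoIndexingJobConcurrency=4,
    )
    print(json.dumps({"settings": settings}, indent=2))
```

The printed JSON object can then be merged into the config file alongside the other top-level properties. The sketch does not cover the deprecated boolean `enablePublicAccess`, since it is replaced by an environment variable.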