import "perkeep/pkg/blobserver/blobpacked"

Overview

Package blobpacked registers the "blobpacked" blobserver storage type, storing blobs initially as one physical blob per logical blob, but then rearranging little physical blobs into large contiguous blobs organized by how they'll likely be accessed. An index tracks the mapping from logical to physical blobs.

Example low-level config:

"/storage/": {
    "handler": "storage-blobpacked",
    "handlerArgs": {
       "smallBlobs": "/small/",
       "largeBlobs": "/large/",
       "metaIndex": {
          "type": "mysql",
           .....
       }
     }
}

The resulting large blobs are valid zip files. Each may be up to 16 MB and contains the original contiguous file (or a fraction of it), as well as metadata about how the file is cut up. The zip file has the following structure:

foo.jpg       (or whatever)
camlistore/sha1-beb1df0b75952c7d277905ad14de71ef7ef90c44.json (some file ref)
camlistore/sha1-a0ceb10b04403c9cc1d032e07a9071db5e711c9a.json (some bytes ref)
camlistore/sha1-7b4d9c8529c27d592255c6dfb17188493db96ccc.json (another bytes ref)
camlistore/camlistore-pack-manifest.json

The camlistore-pack-manifest.json is documented on the exported Manifest type. It looks like this:

{
  "wholeRef": "sha1-0e64816d731a56915e8bb4ae4d0ac7485c0b84da",
  "wholeSize": 2962227200, // 2.8GB; so will require ~176-180 16MB chunks
  "wholePartIndex": 17,    // 0-based
  "dataBlobsOrigin": "sha1-355705cf62a56669303d2561f29e0620a676c36e",
  "dataBlobs": [
      {"blob": "sha1-f1d2d2f924e986ac86fdf7b36c94bcdf32beec15", "offset": 0, "size": 273048},
      {"blob": "sha1-e242ed3bffccdf271b7fbaf34ed72d089537b42f", "offset": 273048, "size": 112783},
      {"blob": "sha1-6eadeac2dade6347e87c0d24fd455feffa7069f0", "offset": 385831, ...},
      {"blob": "sha1-9425cca1dde5d8b6eb70cd087db4e356da92396e", "offset": ...},
      {"blob": "sha1-7709559a3c8668c57cc0a2f57c418b1cc3598049", "offset": ...},
      {"blob": "sha1-f62cb5d05cfbf2a7a6c7f8339d0a4bf1dcd0ab6c", "offset": ...}
  ] // raw data blobs of foo.jpg
}

The manifest.json ensures that if the metadata index is lost, all the data can be reconstructed from the raw zip files.
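As a rough illustration of that recovery path, the sketch below opens a zip with the standard archive/zip package and decodes the manifest entry. It is not the package's own recovery code. The packManifest type is a local stand-in for the exported Manifest type with blobrefs held as plain strings, and readManifest and sampleZip are hypothetical helpers that operate on made-up data.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// manifestName is the manifest path inside a pack zip, as listed above.
const manifestName = "camlistore/camlistore-pack-manifest.json"

// packManifest is a local stand-in for blobpacked.Manifest. Keeping
// blobrefs as strings is an assumption made to stay self-contained.
type packManifest struct {
	WholeRef        string `json:"wholeRef"`
	WholeSize       int64  `json:"wholeSize"`
	WholePartIndex  int    `json:"wholePartIndex"`
	DataBlobsOrigin string `json:"dataBlobsOrigin"`
	DataBlobs       []struct {
		Blob   string `json:"blob"`
		Offset int64  `json:"offset"`
		Size   int64  `json:"size"`
	} `json:"dataBlobs"`
}

// readManifest scans the zip for the manifest entry and decodes it.
func readManifest(zipBytes []byte) (*packManifest, error) {
	zr, err := zip.NewReader(bytes.NewReader(zipBytes), int64(len(zipBytes)))
	if err != nil {
		return nil, err
	}
	for _, f := range zr.File {
		if f.Name != manifestName {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return nil, err
		}
		defer rc.Close()
		m := new(packManifest)
		if err := json.NewDecoder(rc).Decode(m); err != nil {
			return nil, err
		}
		return m, nil
	}
	return nil, errors.New("pack manifest not found")
}

// sampleZip builds a tiny in-memory zip with made-up contents for the demo.
func sampleZip() ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	files := []struct{ name, body string }{
		{"foo.jpg", "hello"},
		{manifestName, `{"wholeRef":"sha1-example","wholeSize":5,"wholePartIndex":0}`},
	}
	for _, f := range files {
		w, err := zw.Create(f.name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(f.body)); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	z, err := sampleZip()
	if err != nil {
		panic(err)
	}
	m, err := readManifest(z)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.WholeRef, m.WholeSize, m.WholePartIndex)
}
```

A real rebuild would also walk the dataBlobs entries and re-register each logical blob in the index; that part is omitted here.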

The 'wholeRef' property identifies the large file that this zip is part of. If the file is smaller than about 15.5 MB (leaving room for the zip overhead and the manifest), it will probably fit in a single zip, and the first file in the zip will be the whole thing. Otherwise it is cut across multiple zip files, each no larger than 16 MB. In that case, each part of the file has a different 'wholePartIndex' number, starting at index 0, and every part has the same 'wholeSize'.
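The chunk count for a large file can be bounded from below with a ceiling division. The sketch assumes that "16 MB" means 16 MiB (the text does not say which) and ignores zip and manifest overhead, so the real number of zips may be somewhat higher. The names minZips and maxZipSize are hypothetical.

```go
package main

import "fmt"

// maxZipSize assumes "16 MB" means 16 MiB; this is an assumption.
const maxZipSize = 16 << 20

// minZips returns a lower bound on the number of zips needed for a file
// of wholeSize bytes, ignoring zip and manifest overhead.
func minZips(wholeSize int64) int64 {
	return (wholeSize + maxZipSize - 1) / maxZipSize
}

func main() {
	// The wholeSize from the manifest example above.
	fmt.Println(minZips(2962227200)) // prints 177
}
```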

Index

func SetRecovery(mode RecoveryMode)
type BlobAndPos
type Manifest
type RecoveryMode

Package files

blobpacked.go stream.go subfetch.go wholefetch.go

func SetRecovery

func SetRecovery(mode RecoveryMode)

SetRecovery sets the recovery mode for the blobpacked package. If set to any mode other than NoRecovery, any blobpacked storage initialized afterwards will start by rebuilding its meta index of zip files, in accordance with the selected mode.

type BlobAndPos

type BlobAndPos struct {
    blob.SizedRef
    Offset int64 `json:"offset"`
}

A BlobAndPos is a blobref, its size, and where it is located within a larger group of bytes.
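To make the offset and size concrete, here is a minimal sketch that cuts one logical blob out of a larger byte slice. The blobAndPos type is a local stand-in with the embedded blob.SizedRef flattened into plain fields, and sliceBlob is a hypothetical helper, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// blobAndPos is a local stand-in for blobpacked.BlobAndPos. The embedded
// blob.SizedRef is flattened into Blob and Size for this sketch.
type blobAndPos struct {
	Blob   string
	Size   int64
	Offset int64
}

// sliceBlob returns the bytes of one logical blob inside data, after
// checking that the described range lies within data.
func sliceBlob(data []byte, bp blobAndPos) ([]byte, error) {
	n := int64(len(data))
	if bp.Offset < 0 || bp.Size < 0 || bp.Offset > n || bp.Size > n-bp.Offset {
		return nil, errors.New("blob range lies outside the data")
	}
	return data[bp.Offset : bp.Offset+bp.Size], nil
}

func main() {
	data := []byte("hello world")
	b, err := sliceBlob(data, blobAndPos{Blob: "sha1-example", Offset: 6, Size: 5})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // prints "world"
}
```

A careful reader would also hash the returned bytes and compare the result against the blobref before trusting them.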

type Manifest

type Manifest struct {
    // WholeRef is the blobref of the entire file that this zip is
    // either fully or partially describing.  For files under
    // around 16MB, the WholeRef and DataBlobsOrigin will be
    // the same.
    WholeRef blob.Ref `json:"wholeRef"`

    // WholeSize is the number of bytes in the original file being
    // cut up.
    WholeSize int64 `json:"wholeSize"`

    // WholePartIndex is the chunk number (0-based) of this zip file.
    // If a client has 'n' zip files with the same WholeRef whose
    // WholePartIndexes are contiguous (including 0) and the sum of
    // the DataBlobs sizes equals WholeSize, the client has the entire
    // original file.
    WholePartIndex int `json:"wholePartIndex"`

    // DataBlobsOrigin is the blobref of the contents of the first
    // file in the zip pack file. This first file is the actual data,
    // or a part of it, that the rest of this zip is describing or
    // referencing.
    DataBlobsOrigin blob.Ref `json:"dataBlobsOrigin"`

    // DataBlobs describes all the logical blobs that are
    // concatenated together in the first file in the zip file.
    // The offsets are relative to the beginning of that first
    // file, not the beginning of the zip file itself.
    DataBlobs []BlobAndPos `json:"dataBlobs"`
}

Manifest is the JSON description type representing the "camlistore/camlistore-pack-manifest.json" file found in a blobpack zip file.
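The completeness rule in the WholePartIndex comment can be sketched as a small check over a set of manifests. The types below are trimmed local stand-ins for Manifest and BlobAndPos, and haveWholeFile is a hypothetical helper written only to illustrate the rule.

```go
package main

import (
	"fmt"
	"sort"
)

// Trimmed local stand-ins for the exported types (an assumption made to
// keep the example self-contained).
type blobAndPos struct {
	Size   int64
	Offset int64
}

type manifest struct {
	WholeRef       string
	WholeSize      int64
	WholePartIndex int
	DataBlobs      []blobAndPos
}

// haveWholeFile reports whether ms share one WholeRef and WholeSize, have
// part indexes 0..n-1 with no gaps, and list data blobs whose sizes sum
// to WholeSize.
func haveWholeFile(ms []manifest) bool {
	if len(ms) == 0 {
		return false
	}
	sorted := append([]manifest(nil), ms...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].WholePartIndex < sorted[j].WholePartIndex
	})
	first := sorted[0]
	var total int64
	for i, m := range sorted {
		if m.WholeRef != first.WholeRef || m.WholeSize != first.WholeSize || m.WholePartIndex != i {
			return false
		}
		for _, b := range m.DataBlobs {
			total += b.Size
		}
	}
	return total == first.WholeSize
}

func main() {
	parts := []manifest{
		{WholeRef: "sha1-example", WholeSize: 10, WholePartIndex: 1, DataBlobs: []blobAndPos{{Size: 4}}},
		{WholeRef: "sha1-example", WholeSize: 10, WholePartIndex: 0, DataBlobs: []blobAndPos{{Size: 6}}},
	}
	fmt.Println(haveWholeFile(parts))     // prints true
	fmt.Println(haveWholeFile(parts[:1])) // prints false: part 0 is missing
}
```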

type RecoveryMode

type RecoveryMode int

RecoveryMode is the mode in which the blobpacked server starts.

const (
    // NoRecovery means blobpacked does not attempt to repair its index on startup.
    // It is the default.
    NoRecovery RecoveryMode = 0
    // FastRecovery populates the blobpacked index, without erasing any existing one.
    FastRecovery RecoveryMode = 1
    // FullRecovery erases the existing blobpacked index, then rebuilds it.
    FullRecovery RecoveryMode = 2
)

Note: these values are not defined with iota because they are stored in a GCE instance's metadata values.
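Because the numeric values are persisted outside the program, they need to stay stable and be validated when read back. The sketch below assumes the mode is stored as a decimal string, which the documentation does not state. The recoveryMode type mirrors RecoveryMode locally, and parseRecoveryMode is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// recoveryMode is a local mirror of blobpacked.RecoveryMode with the
// same explicit values.
type recoveryMode int

const (
	noRecovery   recoveryMode = 0
	fastRecovery recoveryMode = 1
	fullRecovery recoveryMode = 2
)

// parseRecoveryMode converts a stored value into a recoveryMode. The
// decimal string encoding is an assumption of this sketch.
func parseRecoveryMode(s string) (recoveryMode, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return noRecovery, fmt.Errorf("invalid recovery mode %q: %v", s, err)
	}
	switch m := recoveryMode(n); m {
	case noRecovery, fastRecovery, fullRecovery:
		return m, nil
	}
	return noRecovery, fmt.Errorf("unknown recovery mode %d", n)
}

func main() {
	m, err := parseRecoveryMode("2")
	if err != nil {
		panic(err)
	}
	fmt.Println(m == fullRecovery) // prints true
}
```

With the real package, the parsed value would then be passed to SetRecovery before any blobpacked storage is initialized.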
