import "perkeep/pkg/importer"

Overview

Package importer imports content from third-party websites.

Index

Constants
func All() map[string]Importer
func Register(name string, im Importer)
func RegisterTODO(name string, p Properties)
type Host
    func NewHost(hc HostConfig) (*Host, error)
    func (h *Host) AccountsStatus() (interface{}, []camtypes.StatusError)
    func (h *Host) BaseURL() string
    func (h *Host) BlobSource() blob.Fetcher
    func (h *Host) HTTPClient() *http.Client
    func (h *Host) HTTPTransport() http.RoundTripper
    func (h *Host) ImporterBaseURL() string
    func (h *Host) InitHandler(hl blobserver.FindHandlerByTyper) error
    func (h *Host) NewObject() (*Object, error)
    func (h *Host) ObjectFromRef(permanodeRef blob.Ref) (*Object, error)
    func (h *Host) RunImporterAccount(importerType string, accountNode blob.Ref) error
    func (h *Host) Searcher() search.QueryDescriber
    func (h *Host) ServeHTTP(w http.ResponseWriter, r *http.Request)
    func (h *Host) Target() blobserver.StatReceiver
type HostConfig
type Importer
type ImporterSetupHTMLer
type LongPoller
type OAuth1
    func (OAuth1) CallbackRequestAccount(r *http.Request) (blob.Ref, error)
    func (OAuth1) CallbackURLParameters(acctRef blob.Ref) url.Values
type OAuth2
    func (OAuth2) CallbackRequestAccount(r *http.Request) (blob.Ref, error)
    func (OAuth2) CallbackURLParameters(acctRef blob.Ref) url.Values
    func (OAuth2) IsAccountReady(acctNode *Object) (ok bool, err error)
    func (OAuth2) RedirectState(imp Importer, ctx *SetupContext) (state string, err error)
    func (OAuth2) RedirectURL(imp Importer, ctx *SetupContext) string
    func (im OAuth2) SummarizeAccount(acct *Object) string
type OAuthContext
    func (octx OAuthContext) Get(url string, form url.Values) (*http.Response, error)
    func (octx OAuthContext) POST(url string, form url.Values) (*http.Response, error)
    func (octx OAuthContext) PopulateJSONFromURL(result interface{}, method string, apiURL string, keyval ...string) error
type OAuthURIs
type Object
    func (o *Object) Attr(attr string) string
    func (o *Object) Attrs(attr string) []string
    func (o *Object) ChildPathObject(path string) (*Object, error)
    func (o *Object) ChildPathObjectOrFunc(path string, fn func() (*Object, error)) (*Object, error)
    func (o *Object) DelAttr(key, value string) error
    func (o *Object) ForeachAttr(fn func(key, value string))
    func (o *Object) PermanodeRef() blob.Ref
    func (o *Object) SetAttr(key, value string) error
    func (o *Object) SetAttrValues(key string, attrs []string) error
    func (o *Object) SetAttrs(keyval ...string) error
    func (o *Object) SetAttrs2(keyval ...string) (changes bool, err error)
type ProgressMessage
type Properties
type RunContext
    func CreateAccount(h *Host, impl string) (*RunContext, error)
    func (rc *RunContext) AccountNode() *Object
    func (rc *RunContext) Context() context.Context
    func (rc *RunContext) Credentials() (clientID, clientSecret string, err error)
    func (rc *RunContext) RootNode() *Object
type SetupContext
    func (sc *SetupContext) AccountURL() string
    func (sc *SetupContext) CallbackURL() string
    func (sc *SetupContext) Credentials() (clientID, clientSecret string, err error)
    func (ctx *SetupContext) NewOAuthClient(uris OAuthURIs) (*oauth.Client, error)
type TestDataMaker

Package files

attrs.go html.go importer.go noop.go oauth.go

Constants

const (

    // AcctAttrUserID is the account's internal representation, and often an account number.
    // It is usually required as an argument in API calls to the site we import from.
    // Not found on schema.org.
    // Example: "3179713".
    AcctAttrUserID = "userID"
    // AcctAttrUserName is the public identifier of the account. Commonly referred to as
    // "username", or "screen name", or "account name". Often a one word string.
    // Not found on schema.org.
    // Example: "johnSmith" from Twitter's "@johnSmith".
    AcctAttrUserName = "userName"

    // AcctAttrCompletedVersion records the importer-specific
    // "version number" that last ran to completion, doing a full
    // import. When importers are updated with new behavior,
    // they update their version number and that triggers a full
    // import, rather than incremental imports.
    AcctAttrCompletedVersion = "completedVersion"

    // AcctAttrName is a longer or alternate public representation of the account's name.
    // It is often the full name of the person's account (family name and given name), thus
    // sometimes redundant with the combination of AcctAttrFamilyName and AcctAttrGivenName.
    // Found at http://schema.org/Person.
    // Example: "John Smith".
    AcctAttrName = "name"
    // http://schema.org/givenName
    // Example: "John".
    AcctAttrGivenName = "givenName"
    // http://schema.org/familyName
    // Example: "Smith".
    AcctAttrFamilyName = "familyName"

    // AttrID is the generic identifier of an item when nothing suitable and more specific
    // was found on http://schema.org. Usually a number.
    AttrID = "ID"
    // http://schema.org/name
    AttrName = "name"
    // Free-flowing text definition of a location or place, such
    // as a city name, or a full postal address.
    AttrLocationText = "locationText"
    // AttrURL is the item's original or origin URL.
    AttrURL = "url"

    // AttrStartDate is http://schema.org/startDate: The start
    // date and time of the event or item (in ISO 8601 date
    // format)
    AttrStartDate = "startDate"
)

TODO(mpl): use these on all the importers.

const (
    AcctAttrTempToken         = "oauthTempToken"
    AcctAttrTempSecret        = "oauthTempSecret"
    AcctAttrAccessToken       = "oauthAccessToken"
    AcctAttrAccessTokenSecret = "oauthAccessTokenSecret"
)

func All

func All() map[string]Importer

All returns the map of importer implementation name to implementation. This map should not be mutated.

func Register

func Register(name string, im Importer)

Register registers a site-specific importer. It should only be called from init, and not from concurrent goroutines.
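The registration pattern described for Register and All can be sketched with a stand-in registry. This is a minimal illustration, not the package's implementation: the names importer, stubImporter, registry, register, and all are hypothetical, and panicking on a duplicate name is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// importer is a hypothetical stand-in for the package's Importer interface.
type importer interface {
	Title() string
}

type stubImporter struct{ title string }

func (s stubImporter) Title() string { return s.title }

// registry maps implementation name to implementation.
var registry = map[string]importer{}

// register adds an implementation under name. It does no locking because,
// like the documented Register, it is meant to be called only from init.
// Panicking on a duplicate name is an assumption of this sketch.
func register(name string, im importer) {
	if _, dup := registry[name]; dup {
		panic("importer: duplicate registration of " + name)
	}
	registry[name] = im
}

// all returns the registry map; callers must treat it as read-only.
func all() map[string]importer { return registry }

func init() {
	register("dummy", stubImporter{title: "Dummy"})
	register("feed", stubImporter{title: "Feed"})
}

func main() {
	names := make([]string, 0, len(all()))
	for name := range all() {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

Registering from init keeps the map free of concurrent writes, which is why the sketch needs no mutex.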

func RegisterTODO

func RegisterTODO(name string, p Properties)

type Host

type Host struct {
    // contains filtered or unexported fields
}

Host is the HTTP handler and state for managing all the importers linked into the binary, even if they're not configured.

func NewHost

func NewHost(hc HostConfig) (*Host, error)

func (*Host) AccountsStatus

func (h *Host) AccountsStatus() (interface{}, []camtypes.StatusError)

AccountsStatus returns the currently configured accounts and their status for inclusion in the status.json document, as rendered by the web UI.

func (*Host) BaseURL

func (h *Host) BaseURL() string

BaseURL returns the root of the whole server, without trailing slash.

func (*Host) BlobSource

func (h *Host) BlobSource() blob.Fetcher

func (*Host) HTTPClient

func (h *Host) HTTPClient() *http.Client

HTTPClient returns the HTTP client to use.

func (*Host) HTTPTransport

func (h *Host) HTTPTransport() http.RoundTripper

HTTPTransport returns the HTTP transport to use.

func (*Host) ImporterBaseURL

func (h *Host) ImporterBaseURL() string

ImporterBaseURL returns the URL base of the importer handler, including trailing slash.

func (*Host) InitHandler

func (h *Host) InitHandler(hl blobserver.FindHandlerByTyper) error

func (*Host) NewObject

func (h *Host) NewObject() (*Object, error)

NewObject creates a new permanode and returns its Object wrapper.

func (*Host) ObjectFromRef

func (h *Host) ObjectFromRef(permanodeRef blob.Ref) (*Object, error)

ObjectFromRef returns the object given by the named permanode.

func (*Host) RunImporterAccount

func (h *Host) RunImporterAccount(importerType string, accountNode blob.Ref) error

RunImporterAccount runs the importerType importer on the account described in accountNode.

func (*Host) Searcher

func (h *Host) Searcher() search.QueryDescriber

func (*Host) ServeHTTP

func (h *Host) ServeHTTP(w http.ResponseWriter, r *http.Request)

ServeHTTP serves:

http://host/importer/
http://host/importer/twitter/
http://host/importer/twitter/callback
http://host/importer/twitter/sha1-abcabcabcabcabc (single account)

func (*Host) Target

func (h *Host) Target() blobserver.StatReceiver

type HostConfig

type HostConfig struct {
    BaseURL      string
    Prefix       string                  // URL prefix for the importer handler
    Target       blobserver.StatReceiver // storage for the imported object blobs
    BlobSource   blob.Fetcher            // for additional resources, such as twitter zip file
    Signer       *schema.Signer
    Search       search.QueryDescriber
    ClientId     map[string]string // optionally maps importer impl name to a clientId credential
    ClientSecret map[string]string // optionally maps importer impl name to a clientSecret credential

    // HTTPClient optionally specifies how to fetch external network
    // resources. The Host will use http.DefaultClient otherwise.
    HTTPClient *http.Client
}

HostConfig holds the parameters to set up a Host.

type Importer

type Importer interface {
    // Run runs a full or incremental import.
    //
    // The importer should continually or periodically monitor the
    // RunContext.Context()'s Done channel to exit early if
    // requested. The return value should be ctx.Err() if the
    // importer exits for that reason.
    Run(*RunContext) error

    // Properties returns properties of this importer type.
    Properties() Properties

    // IsAccountReady reports whether the provided account node
    // is configured.
    IsAccountReady(acctNode *Object) (ok bool, err error)
    SummarizeAccount(acctNode *Object) string

    ServeSetup(w http.ResponseWriter, r *http.Request, ctx *SetupContext) error
    ServeCallback(w http.ResponseWriter, r *http.Request, ctx *SetupContext)

    // CallbackRequestAccount extracts the blobref of the importer account from
    // the callback URL parameters of r. For example, it will be encoded as:
    // For Twitter (OAuth1), in its own URL parameter: "acct=sha1-f2b0b7da718b97ce8c31591d8ed4645c777f3ef4"
    // For Picasa (OAuth2), in the OAuth2 "state" parameter: "state=acct:sha1-97911b1a5887eb5862d1c81666ba839fc1363ea1"
    CallbackRequestAccount(r *http.Request) (acctRef blob.Ref, err error)

    // CallbackURLParameters uses the input importer account blobRef to build
    // and return the URL parameters that will be appended to the callback URL.
    CallbackURLParameters(acctRef blob.Ref) url.Values
}

An Importer imports from a third-party site.

type ImporterSetupHTMLer

type ImporterSetupHTMLer interface {
    AccountSetupHTML(*Host) string
}

ImporterSetupHTMLer is an optional interface that may be implemented by Importers to return some HTML to be included on the importer setup page.

type LongPoller

type LongPoller interface {
    Importer

    // LongPoll waits and returns nil when there's new content.
    // It does not fetch the content itself.
    // It returns a non-nil error if it failed to long poll.
    LongPoll(*RunContext) error
}

LongPoller is optionally implemented by importers which can long poll efficiently to wait for new content. For example, Twitter uses this to subscribe to the user's stream.

type OAuth1

type OAuth1 struct{}

OAuth1 provides methods that the importer implementations can use to help with OAuth authentication.

func (OAuth1) CallbackRequestAccount

func (OAuth1) CallbackRequestAccount(r *http.Request) (blob.Ref, error)

func (OAuth1) CallbackURLParameters

func (OAuth1) CallbackURLParameters(acctRef blob.Ref) url.Values

type OAuth2

type OAuth2 struct{}

OAuth2 provides methods that the importer implementations can use to help with OAuth2 authentication.

func (OAuth2) CallbackRequestAccount

func (OAuth2) CallbackRequestAccount(r *http.Request) (blob.Ref, error)

func (OAuth2) CallbackURLParameters

func (OAuth2) CallbackURLParameters(acctRef blob.Ref) url.Values

func (OAuth2) IsAccountReady

func (OAuth2) IsAccountReady(acctNode *Object) (ok bool, err error)

IsAccountReady reports whether the account has been properly configured, that is, whether the user ID and access token have been stored in the given account node.

func (OAuth2) RedirectState

func (OAuth2) RedirectState(imp Importer, ctx *SetupContext) (state string, err error)

RedirectState returns the "state" query parameter that should be used for the authorization phase of OAuth2 authentication. This parameter contains the query component of the redirection URI. See http://tools.ietf.org/html/rfc6749#section-3.1.2.2

func (OAuth2) RedirectURL

func (OAuth2) RedirectURL(imp Importer, ctx *SetupContext) string

RedirectURL returns the redirect URI that imp should set in an oauth.Config for the authorization phase of OAuth2 authentication.

func (OAuth2) SummarizeAccount

func (im OAuth2) SummarizeAccount(acct *Object) string

SummarizeAccount returns a summary for the account if it is configured, or an error string otherwise.

type OAuthContext

type OAuthContext struct {
    Ctx    context.Context
    Client *oauth.Client
    Creds  *oauth.Credentials
}

OAuthContext wraps the OAuth1 state needed to perform API calls.

It is used as a value type.

func (OAuthContext) Get

func (octx OAuthContext) Get(url string, form url.Values) (*http.Response, error)

func (OAuthContext) POST

func (octx OAuthContext) POST(url string, form url.Values) (*http.Response, error)

func (OAuthContext) PopulateJSONFromURL

func (octx OAuthContext) PopulateJSONFromURL(result interface{}, method string, apiURL string, keyval ...string) error

PopulateJSONFromURL makes a POST or GET call at apiURL, using keyval as parameters of the associated form. The JSON response is decoded into result.

type OAuthURIs

type OAuthURIs struct {
    TemporaryCredentialRequestURI string
    ResourceOwnerAuthorizationURI string
    TokenRequestURI               string
}

OAuthURIs holds the URIs needed to initialize an OAuth 1 client.

type Object

type Object struct {
    // contains filtered or unexported fields
}

An Object is a wrapper around a permanode that the importer uses to synchronize.

func (*Object) Attr

func (o *Object) Attr(attr string) string

Attr returns the object's attribute value for the provided attr, or the empty string if unset. To distinguish between unset, an empty string, or multiple attribute values, use Attrs.

func (*Object) Attrs

func (o *Object) Attrs(attr string) []string

Attrs returns the attribute values for the provided attr.
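The difference between Attr and Attrs shows up when an attribute is unset, set to the empty string, or multi-valued. The sketch below models that with a plain map. The attrs type and its methods are hypothetical, and returning the first value from attr is an assumption of this sketch.

```go
package main

import "fmt"

// attrs is a hypothetical stand-in for an Object's attribute storage.
type attrs map[string][]string

// attr returns the first value for key, or "" if there is none.
func (a attrs) attr(key string) string {
	if vs := a[key]; len(vs) > 0 {
		return vs[0]
	}
	return ""
}

// values returns every value stored for key.
func (a attrs) values(key string) []string { return a[key] }

func main() {
	a := attrs{"title": {""}, "tag": {"go", "perkeep"}}
	for _, k := range []string{"missing", "title", "tag"} {
		fmt.Printf("%s: attr=%q values=%d\n", k, a.attr(k), len(a.values(k)))
	}
}
```

Both "missing" and "title" yield an empty string from attr; only the length of values tells them apart.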

func (*Object) ChildPathObject

func (o *Object) ChildPathObject(path string) (*Object, error)

ChildPathObject returns (creating if necessary) the child object from the permanode o, given by the "camliPath:xxxx" attribute, where xxxx is the provided path.

func (*Object) ChildPathObjectOrFunc

func (o *Object) ChildPathObjectOrFunc(path string, fn func() (*Object, error)) (*Object, error)

ChildPathObjectOrFunc returns the child object from the permanode o, given by the "camliPath:xxxx" attribute, where xxxx is the provided path. If the path doesn't exist, the provided func should return an appropriate object. If the func fails, its error is returned directly without any attempt to make a permanode.

func (*Object) DelAttr

func (o *Object) DelAttr(key, value string) error

DelAttr removes value from the values set for the attribute key of the object's permanode. If value is empty, all the values for the attribute are cleared.

func (*Object) ForeachAttr

func (o *Object) ForeachAttr(fn func(key, value string))

ForeachAttr runs fn for each of the object's attributes & values. There might be multiple values for the same attribute. The internal lock is held while running, so no mutations should be made or it will deadlock.
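Because the lock is held during iteration, one way to combine ForeachAttr with mutations is to collect what needs changing inside the callback and apply the changes afterwards. The sketch below shows that pattern on a stand-in type. The names object, foreachAttr, delAttr, and deleteEmpty are hypothetical, and the single sync.Mutex is an assumption about the locking.

```go
package main

import (
	"fmt"
	"sync"
)

// object is a hypothetical stand-in for Object with one internal lock.
type object struct {
	mu    sync.Mutex
	attrs map[string][]string
}

// foreachAttr calls fn for every key/value pair while holding the lock.
func (o *object) foreachAttr(fn func(key, value string)) {
	o.mu.Lock()
	defer o.mu.Unlock()
	for k, vs := range o.attrs {
		for _, v := range vs {
			fn(k, v)
		}
	}
}

// delAttr removes all values for key; it takes the same lock.
func (o *object) delAttr(key string) {
	o.mu.Lock()
	defer o.mu.Unlock()
	delete(o.attrs, key)
}

// deleteEmpty removes every attribute that has an empty value. Keys are
// collected during iteration and deleted only after foreachAttr has
// returned and released the lock; calling delAttr inside the callback
// would block forever on the non-reentrant mutex.
func deleteEmpty(o *object) {
	var stale []string
	o.foreachAttr(func(k, v string) {
		if v == "" {
			stale = append(stale, k)
		}
	})
	for _, k := range stale {
		o.delAttr(k)
	}
}

func main() {
	o := &object{attrs: map[string][]string{"title": {""}, "tag": {"go"}}}
	deleteEmpty(o)
	fmt.Println(len(o.attrs))
}
```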

func (*Object) PermanodeRef

func (o *Object) PermanodeRef() blob.Ref

PermanodeRef returns the permanode that this object wraps.

func (*Object) SetAttr

func (o *Object) SetAttr(key, value string) error

SetAttr sets the attribute key to value.

func (*Object) SetAttrValues

func (o *Object) SetAttrValues(key string, attrs []string) error

SetAttrValues sets a multi-valued attribute.

func (*Object) SetAttrs

func (o *Object) SetAttrs(keyval ...string) error

SetAttrs sets multiple attributes. The provided keyval should be an even number of alternating key/value pairs to set.

func (*Object) SetAttrs2

func (o *Object) SetAttrs2(keyval ...string) (changes bool, err error)

SetAttrs2 sets multiple attributes and returns whether there were any changes. The provided keyval should be an even number of alternating key/value pairs to set.
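The alternating key/value convention and the change report of SetAttrs2 can be sketched over a plain map. setAttrs is a hypothetical function, single-valued attributes are a simplification, and rejecting an odd number of arguments with an error is this sketch's choice rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// setAttrs applies alternating key/value pairs to m and reports whether any
// value actually changed. An odd number of arguments is rejected.
func setAttrs(m map[string]string, keyval ...string) (changes bool, err error) {
	if len(keyval)%2 != 0 {
		return false, errors.New("setAttrs: odd number of key/value arguments")
	}
	for i := 0; i < len(keyval); i += 2 {
		k, v := keyval[i], keyval[i+1]
		if old, ok := m[k]; !ok || old != v {
			m[k] = v
			changes = true
		}
	}
	return changes, nil
}

func main() {
	m := map[string]string{}
	first, _ := setAttrs(m, "userID", "3179713", "userName", "johnSmith")
	second, _ := setAttrs(m, "userID", "3179713", "userName", "johnSmith")
	fmt.Println(first, second)
}
```

Reporting changes lets an incremental import skip follow-up work when a re-run writes the same values.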

type ProgressMessage

type ProgressMessage struct {
    ItemsDone, ItemsTotal int
    BytesDone, BytesTotal int64
}

type Properties

type Properties struct {
    Title       string
    Description string

    // TODOIssue, if non-zero, marks the importer as invalid, but the UI
    // will link to a tracking bug for implementing it.
    TODOIssue int

    // NeedsAPIKey reports whether this importer requires an API key
    // (OAuth2 client_id & client_secret, or equivalent).
    // If the API only requires a username & password, or a flow to get
    // an auth token per-account without an overall API key, importers
    // can return false here.
    NeedsAPIKey bool

    // SupportsIncremental reports whether this importer has been optimized
    // to run efficiently in regular incremental runs. (e.g. every 5 minutes
    // or half hour). Eventually all importers might support this and we'll
    // make it required, in which case we might delete this option.
    // For now, some importers (e.g. Flickr) don't yet support this.
    SupportsIncremental bool

    // PermanodeImporterType optionally specifies the "importerType"
    // permanode attribute value that should be stored for
    // accounts of this type. By default, it is the string that it
    // was registered with. This should only be specified for
    // products that have been rebranded, and then this should be
    // the old branding, to not break people who have been
    // importing the account since before the rebranding.
    // For example, this is "foursquare" for "swarm", so "swarm" shows
    // in the UI and URLs, but it's "foursquare" in permanodes.
    PermanodeImporterType string
}

Properties contains the properties of an importer type.
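The defaulting rule in the PermanodeImporterType comment (use the override if set, otherwise the registered name) amounts to a small lookup. The following is an illustrative sketch; the properties struct and storedImporterType are hypothetical and mirror only the one field discussed.

```go
package main

import "fmt"

// properties is a hypothetical stand-in carrying only the field of interest.
type properties struct {
	PermanodeImporterType string
}

// storedImporterType returns the "importerType" value to store on account
// permanodes: the override if set, otherwise the registered name.
func storedImporterType(registeredName string, p properties) string {
	if p.PermanodeImporterType != "" {
		return p.PermanodeImporterType
	}
	return registeredName
}

func main() {
	fmt.Println(storedImporterType("swarm", properties{PermanodeImporterType: "foursquare"}))
	fmt.Println(storedImporterType("twitter", properties{}))
}
```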

type RunContext

type RunContext struct {
    Host *Host
    // contains filtered or unexported fields
}

RunContext is the context provided for a given Run of an importer, importing a certain account on a certain importer.

func CreateAccount

func CreateAccount(h *Host, impl string) (*RunContext, error)

CreateAccount creates a new importer account for the Host h, and the importer implementation named impl. It returns a RunContext set up with that account.

func (*RunContext) AccountNode

func (rc *RunContext) AccountNode() *Object

AccountNode returns the permanode storing account information for this account. It will contain the attributes camliNodeType and importerType.

You must not change the camliNodeType or importerType.

You should use this permanode to store state about where your importer left off, if it can efficiently resume later (without missing anything).

func (*RunContext) Context

func (rc *RunContext) Context() context.Context

Context returns the run's context. It is always non-nil.

func (*RunContext) Credentials

func (rc *RunContext) Credentials() (clientID, clientSecret string, err error)

Credentials returns the credentials for the importer. This is typically the OAuth1, OAuth2, or equivalent client ID (api token) and client secret (api secret).

func (*RunContext) RootNode

func (rc *RunContext) RootNode() *Object

RootNode returns the initially-empty permanode storing the root of this account's data. You can change anything at will. This will typically be modeled as a dynamic directory (with camliPath:xxxx attributes), where each path element is either a file, object, or another dynamic directory.

type SetupContext

type SetupContext struct {
    context.Context
    Host        *Host
    AccountNode *Object
    // contains filtered or unexported fields
}

func (*SetupContext) AccountURL

func (sc *SetupContext) AccountURL() string

AccountURL returns the URL to an account of an importer (http://host/importer/TYPE/sha1-sd8fsd7f8sdf7).

func (*SetupContext) CallbackURL

func (sc *SetupContext) CallbackURL() string

func (*SetupContext) Credentials

func (sc *SetupContext) Credentials() (clientID, clientSecret string, err error)

func (*SetupContext) NewOAuthClient

func (ctx *SetupContext) NewOAuthClient(uris OAuthURIs) (*oauth.Client, error)

NewOAuthClient returns an oauth Client configured with uris and the credentials obtained from ctx.

type TestDataMaker

type TestDataMaker interface {
    MakeTestData() http.RoundTripper
    // SetTestAccount allows an importer to set some needed attributes on the importer
    // account node before a run is started.
    SetTestAccount(acctNode *Object) error
}

TestDataMaker is an optional interface that may be implemented by Importers to generate test data locally. The returned RoundTripper will be used as the transport of the HTTPClient in the RunContext that is passed to Run during tests and devcam server --makethings. (See http://perkeep.org/issue/417).

Subdirectories

Name      Synopsis
allimporters      Package allimporters registers all the importer implementations.
dummy      Package dummy is an example importer for development purposes.
feed      Package feed implements an importer for RSS, Atom, and RDF feeds.
     atom      Package atom defines XML data structures for an Atom feed.
     rdf      Package rdf defines XML data structures for an RDF feed.
     rss      Package rss defines XML data structures for an RSS feed.
flickr      Package flickr implements an importer for flickr.com accounts.
gphotos      Package gphotos implements a Google Photos importer, using the Google Drive API to access the Google Photos folder.
instapaper      Package instapaper implements an instapaper.com importer.
mastodon      Package mastodon provides an importer for servers using the Mastodon API.
picasa      Package picasa implements an importer for picasa.com accounts.
pinboard      Package pinboard imports pinboard.in posts.
plaid      Package plaid implements an importer for financial transactions from plaid.com.
swarm      Package swarm implements an importer for Foursquare Swarm check-ins.
test      Package test provides common functionality for importer tests.
twitter      Package twitter implements a twitter.com importer.