source

package
v4.1.0
Published: Feb 26, 2026 License: AGPL-3.0 Imports: 28 Imported by: 0

Documentation

Overview

Package source provides archive readers for different output formats.

Currently, the following formats are supported:

  • archive
  • database
  • dump
  • Slack Export

Use the Load function to load a source from the file system. It automatically detects the format and returns the appropriate reader.

All sources implement the Sourcer interface, which provides methods to retrieve data from the source. The Resumer interface is implemented by sources that can be resumed. The SourceResumeCloser interface is implemented by sources that can be resumed and closed.

There are two distinct error types of interest, ErrNotFound and ErrNotSupported. See their documentation for more information.

What is a Source?

A source is a generic interface over a data source. It can be implemented not only around Slack data, but also around data from any other messenger, because it represents entities that every messenger has. For example, Slack's "message", "thread" and "channel" would correspond to "message", "reply" and "chat" in Telegram. The caveat is that a non-Slack source's entities need to be converted to Slack entities, which shouldn't be hard to do unless you're aiming to replicate formatting.

In this package, for now, the source is implemented only for data originating from Slackdump and Slack.

Loading a Source

If you know which source you are loading, or need a particular source type, you can use a concrete Open* function, such as OpenDatabase. Otherwise, use the Load function, which determines the source type and returns the appropriate source.

It is a good idea to defer closing the source, as it may hold a database connection or a file handle. The SourceResumeCloser interface returned by Load implements the io.Closer interface, so you can use it with a defer statement. For example:

src, err := source.Load(ctx, "path/to/source")
if err != nil {
    log.Fatal(err)
}
defer src.Close()

Within your code, you can call the Type method which will return the source type.

Source Types

The source type returned by Type is a bitmask. Use the Has method to check whether a particular flag is set. Flag constants all start with F and have the Flags type.

The source type returned by the Type method of a particular source will only have the type flag set, without any additional flags (as of v3.1.0, but this may change in future versions, so always use the Has method to be on the safe side).

Constants

const DefaultDBFile = "slackdump.sqlite"

Variables

var (
	// ErrNotSupported is returned if the method is not supported.
	ErrNotSupported = errors.New("method not supported")
	// ErrNotFound is returned if the data is missing or not found.
	ErrNotFound = errors.New("no data found")
)
var ErrUnknownLinkType = errors.New("unknown link type")

Functions

func AvatarParams

func AvatarParams(u *slack.User) (userID string, filename string)

AvatarParams is a convenience function that returns the user ID and the base name of the original avatar filename to be passed to AvatarStorage.File function. For example:

var as *AvatarStorage
var u *slack.User
fmt.Println(as.File(AvatarParams(u)))

func DumpFilepath

func DumpFilepath(ci *slack.Channel, f *slack.File) string

DumpFilepath returns the path to the file within the channel directory.

func ExportChanName

func ExportChanName(ch *slack.Channel) string

ExportChanName returns the channel name, or the channel ID if it is a DM.

func MattermostFilepath

func MattermostFilepath(_ *slack.Channel, f *slack.File) string

MattermostFilepath returns the path to the file within the __uploads directory.

func MattermostFilepathWithDir

func MattermostFilepathWithDir(dir string) func(*slack.Channel, *slack.File) string

MattermostFilepathWithDir returns a function that resolves the path to the file within the given directory, following the Mattermost naming pattern. In most cases you don't need to use this function.

func SanitizeFilename

func SanitizeFilename(name string) string

SanitizeFilename ensures the filename is safe for all OSes, especially Windows.

func StdFilepath

func StdFilepath(ci *slack.Channel, f *slack.File) string

StdFilepath returns the path to the file within the "attachments" directory.

Types

type AvatarStorage

type AvatarStorage struct {
	// contains filtered or unexported fields
}

func NewAvatarStorage

func NewAvatarStorage(fsys fs.FS) (*AvatarStorage, error)

func (*AvatarStorage) FS

func (r *AvatarStorage) FS() fs.FS

func (*AvatarStorage) File

func (r *AvatarStorage) File(userID string, imageOriginalBase string) (string, error)

func (*AvatarStorage) FilePath

func (r *AvatarStorage) FilePath(_ *slack.Channel, _ *slack.File) string

FilePath is unused on AvatarStorage.

func (*AvatarStorage) Type

func (r *AvatarStorage) Type() StorageType

type ChunkDir

type ChunkDir struct {
	// contains filtered or unexported fields
}

ChunkDir is the chunk directory source.

TODO: create an index of entries, otherwise it does the full scan of the directory.

func OpenChunkDir

func OpenChunkDir(d *chunk.Directory, fast bool) *ChunkDir

OpenChunkDir creates a new ChunkDir source. It expects the attachments to be in the Mattermost storage format. If they are not, it assumes they were not downloaded.

func (*ChunkDir) AllMessages

func (c *ChunkDir) AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)

AllMessages returns all messages for the channel. Current restriction: it expects all messages for the requested channel to be in the single file ID.json.gz. If messages for the channel are scattered across multiple files, it will not return all of them.

func (*ChunkDir) AllThreadMessages

func (c *ChunkDir) AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)

func (*ChunkDir) Avatars

func (c *ChunkDir) Avatars() Storage

func (*ChunkDir) ChannelInfo

func (c *ChunkDir) ChannelInfo(_ context.Context, channelID string) (*slack.Channel, error)

ChannelInfo accepts the fileID (so it can treat channel or thread exports equally). If in doubt, use channelID as the fileID.

func (*ChunkDir) Channels

func (c *ChunkDir) Channels(ctx context.Context) ([]slack.Channel, error)

func (*ChunkDir) Close

func (c *ChunkDir) Close() error

func (*ChunkDir) Files

func (c *ChunkDir) Files() Storage

func (*ChunkDir) Latest

func (c *ChunkDir) Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)

func (*ChunkDir) Name

func (c *ChunkDir) Name() string

func (*ChunkDir) Sorted

func (c *ChunkDir) Sorted(ctx context.Context, id string, desc bool, cb func(ts time.Time, msg *slack.Message) error) error

func (*ChunkDir) ToChunk

func (c *ChunkDir) ToChunk(ctx context.Context, enc chunk.Encoder, _ int64) error

func (*ChunkDir) Type

func (c *ChunkDir) Type() Flags

func (*ChunkDir) Users

func (c *ChunkDir) Users(context.Context) ([]slack.User, error)

func (*ChunkDir) WorkspaceInfo

func (c *ChunkDir) WorkspaceInfo(context.Context) (*slack.AuthTestResponse, error)

type Database

type Database struct {
	*dbase.Source
	// contains filtered or unexported fields
}

Database represents a database source. It implements the Sourcer interface and provides access to the database data. It also provides access to the files and avatars storage, if available. The database source is created by calling OpenDatabase function.

func DatabaseWithSource

func DatabaseWithSource(source *dbase.Source) *Database

DatabaseWithSource returns a new database source with the given database processor source. It will not have any files or avatars storage. In most cases you should use OpenDatabase instead, unless you know what you are doing.

func OpenDatabase

func OpenDatabase(ctx context.Context, path string) (*Database, error)

OpenDatabase attempts to open the database at the given path. The path may be either the database file itself or a directory containing the "slackdump.sqlite" file. In the latter case, it also attempts to open the Mattermost storage; if no storage is found, it falls back to the special NoStorage type, which returns fs.ErrNotExist for all file operations.

func (*Database) Avatars

func (d *Database) Avatars() Storage

func (*Database) Channels

func (d *Database) Channels(ctx context.Context) ([]slack.Channel, error)

func (*Database) Files

func (d *Database) Files() Storage

func (*Database) Name

func (d *Database) Name() string

func (*Database) Type

func (d *Database) Type() Flags

func (*Database) WorkspaceInfo

func (d *Database) WorkspaceInfo(ctx context.Context) (*slack.AuthTestResponse, error)

type Dump

type Dump struct {
	// contains filtered or unexported fields
}

func OpenDump

func OpenDump(ctx context.Context, fsys fs.FS, name string) (*Dump, error)

OpenDump opens the data in dump format (Slackdump v1.1.0+) from filesystem fsys, and the given name. It will scan for file attachments.

If you need to open a dump from Slackdump pre-v1.1.0, convert it first, with the following command:

slackdump tools convertv1

Note: slackdump pre-v1.1.0 dumps do not have threads.

func (Dump) AllMessages

func (d Dump) AllMessages(_ context.Context, channelID string) (iter.Seq2[slack.Message, error], error)

func (Dump) AllThreadMessages

func (d Dump) AllThreadMessages(_ context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)

func (Dump) Avatars

func (d Dump) Avatars() Storage

func (Dump) ChannelInfo

func (d Dump) ChannelInfo(_ context.Context, channelID string) (*slack.Channel, error)

func (Dump) Channels

func (d Dump) Channels(context.Context) ([]slack.Channel, error)

Channels returns channels for the dump. It first tries to read the channels from the channels.json file. If that fails, it will walk the filesystem loading the channel files and extracting channel names and IDs from them.

func (Dump) Close

func (d Dump) Close() error

func (Dump) Files

func (d Dump) Files() Storage

func (Dump) Latest

func (d Dump) Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)

func (Dump) Name

func (d Dump) Name() string

func (*Dump) Sorted

func (d *Dump) Sorted(ctx context.Context, channelID string, desc bool, cb func(ts time.Time, msg *slack.Message) error) error

func (Dump) Type

func (d Dump) Type() Flags

func (Dump) Users

func (d Dump) Users(context.Context) ([]slack.User, error)

Users returns users for the dump. It first tries to read the users from the users.json file. If that fails, there's no other way for it to get users, so it will return an empty slice and a nil error. Dumps may not have user information.

func (Dump) WorkspaceInfo

func (d Dump) WorkspaceInfo(context.Context) (*slack.AuthTestResponse, error)

type Export

type Export struct {
	// contains filtered or unexported fields
}

Export implements viewer.Sourcer for the zip file Slack export format.

func OpenExport

func OpenExport(fsys fs.FS, name string) (*Export, error)

OpenExport opens a Slack export with the given name from the filesystem fsys.

func (*Export) AllMessages

func (e *Export) AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)

AllMessages returns all channel messages without thread messages.

func (*Export) AllThreadMessages

func (e *Export) AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)

AllThreadMessages returns all thread messages for the channelID:threadID. If the thread is contained in the cache, it iterates only through the files that contain the thread messages; otherwise it iterates through all messages in the channel and extracts the thread messages. To speed up the search, call buildThreadCache for the channelID before calling this method.

func (*Export) Avatars

func (e *Export) Avatars() Storage

func (*Export) ChannelInfo

func (e *Export) ChannelInfo(ctx context.Context, channelID string) (*slack.Channel, error)

func (*Export) Channels

func (e *Export) Channels(context.Context) ([]slack.Channel, error)

func (*Export) Close

func (e *Export) Close() error

func (*Export) Files

func (e *Export) Files() Storage

func (*Export) Latest

func (e *Export) Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)

func (*Export) Name

func (e *Export) Name() string

func (*Export) Sorted

func (e *Export) Sorted(ctx context.Context, channelID string, desc bool, cb func(ts time.Time, msg *slack.Message) error) error

func (*Export) Type

func (e *Export) Type() Flags

func (*Export) Users

func (e *Export) Users(context.Context) ([]slack.User, error)

func (*Export) WorkspaceInfo

func (e *Export) WorkspaceInfo(context.Context) (*slack.AuthTestResponse, error)

type Flags

type Flags int8
const (
	// container
	// FDirectory indicates that the source is a directory.
	FDirectory Flags = 1 << iota
	// FZip indicates that the source is a ZIP archive.
	FZip
	// main content
	// FChunk is set on directories with json.gz files.
	FChunk
	// FExport is set on Slack export directories or ZIP files.
	FExport
	// FDump is set on a Slackdump Dump format directory or ZIP file.
	FDump
	// FDatabase indicates that the source is a SQLite database.
	FDatabase

	// FUnknown is returned if the source type is not supported or
	// can not be determined.
	FUnknown Flags = 0
)

func Type

func Type(src string) (Flags, error)

func (Flags) Has

func (f Flags) Has(ff Flags) bool

func (Flags) String

func (f Flags) String() string

type NoStorage

type NoStorage struct{}

NoStorage is the Storage that returns fs.ErrNotExist for all files.

func (NoStorage) FS

func (NoStorage) FS() fs.FS

func (NoStorage) File

func (NoStorage) File(id string, name string) (string, error)

func (NoStorage) FilePath

func (NoStorage) FilePath(*slack.Channel, *slack.File) string

func (NoStorage) Type

func (NoStorage) Type() StorageType

type Resumer

type Resumer interface {
	// Latest should return the latest timestamps of all channels and threads.
	Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)
}

type STDump

type STDump struct {
	// contains filtered or unexported fields
}

STDump is the Storage for the dump format. Files are stored in the directories named after the channel IDs.

Directory structure:

./
  +-- <channel_id1>/
  |   +-- <file_id1>-filename.ext
  |   +-- <file_id2>-otherfile.ext
  |   +-- ...
  +-- <channel_id1>.json
  +-- <channel_id2>/
  |   +-- <file_id3>-filename.ext
  |   +-- <file_id4>-otherfile.ext
  |   +-- ...
  +-- <channel_id2>.json
  +-- ...

func NewDumpStorage

func NewDumpStorage(fsys fs.FS) (*STDump, error)

NewDumpStorage returns the file storage of the Slackdump dump format. fsys is the root of the dump.

func (*STDump) FS

func (r *STDump) FS() fs.FS

func (*STDump) File

func (r *STDump) File(id string, name string) (string, error)

func (*STDump) FilePath

func (r *STDump) FilePath(ci *slack.Channel, f *slack.File) string

func (*STDump) Type

func (r *STDump) Type() StorageType

type STMattermost

type STMattermost struct {
	// contains filtered or unexported fields
}

STMattermost is the Storage for the mattermost export format. Files are stored in the __uploads subdirectory, and the Storage is the filesystem of the __uploads directory.

Directory structure:

./__uploads/
  +-- <file_id1>/filename.ext
  +-- <file_id2>/otherfile.ext
  +-- ...

func OpenMattermostStorage

func OpenMattermostStorage(rootfs fs.FS) (*STMattermost, error)

OpenMattermostStorage returns the resolver for the mattermost export format. rootfs is the root filesystem of the export.

func (*STMattermost) FS

func (r *STMattermost) FS() fs.FS

func (*STMattermost) File

func (r *STMattermost) File(id string, name string) (string, error)

func (*STMattermost) FilePath

func (r *STMattermost) FilePath(_ *slack.Channel, f *slack.File) string

func (*STMattermost) Type

func (r *STMattermost) Type() StorageType

type STStandard

type STStandard struct {
	// contains filtered or unexported fields
}

STStandard is the Storage for the standard export format. Files are stored in the "attachments" subdirectories, and the Storage is the filesystem of the export.

Directory structure:

./
  +-- <channel_name>/
  |   +-- attachments/<file_id1>-filename.ext
  |   +-- attachments/<file_id2>-otherfile.ext
  |   +-- ...
  +-- ...

func OpenStandardStorage

func OpenStandardStorage(rootfs fs.FS) (*STStandard, error)

OpenStandardStorage returns the resolver for the export's standard storage format.

func (*STStandard) FS

func (r *STStandard) FS() fs.FS

func (*STStandard) File

func (r *STStandard) File(id string, name string) (string, error)

func (*STStandard) FilePath

func (r *STStandard) FilePath(ci *slack.Channel, f *slack.File) string

func (*STStandard) Type

func (r *STStandard) Type() StorageType

type SourceResumeCloser

type SourceResumeCloser interface {
	Sourcer
	Resumer
	io.Closer
}

SourceResumeCloser is the interface that should be implemented by sources that can be resumed and closed.

func Load

func Load(ctx context.Context, src string) (SourceResumeCloser, error)

Load loads the source from file src. It will automatically detect the format and return the appropriate reader. It will detect any attachments and avatars if they exist in the source. It will return an error if the source type is not supported or if the source is not found.

type Sourcer

type Sourcer interface {
	// Name should return the name of the retriever underlying media, i.e.
	// directory or archive.
	Name() string
	// Type should return the flag that describes the type of the source.
	// It may not have all the flags set, but it should have at least one
	// identifying the source type.
	Type() Flags
	// Channels should return all channels.
	Channels(ctx context.Context) ([]slack.Channel, error)
	// Users should return all users.
	Users(ctx context.Context) ([]slack.User, error)
	// AllMessages should return all messages for the given channel id.  If there's no messages
	// for the channel, it should return ErrNotFound.
	AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
	// AllThreadMessages should return all messages for the given tuple
	// (channelID, threadID). It should return the parent channel message
	// (thread lead) as a first message.  If there's no messages for the
	// thread, it should return ErrNotFound.
	AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)
	// Sorted should iterate over all (both channel and thread) messages for
	// the requested channel id.  If desc is true, it must return messages in
	// descending order (by timestamp), otherwise in ascending order.  The
	// callback function cb should be called for each message. If cb returns an
	// error, the iteration should be stopped and the error should be returned.
	Sorted(ctx context.Context, channelID string, desc bool, cb func(ts time.Time, msg *slack.Message) error) error
	// ChannelInfo should return the channel information for the given channel
	// id.
	ChannelInfo(ctx context.Context, channelID string) (*slack.Channel, error)
	// Files should return file [Storage].
	Files() Storage
	// Avatars should return the avatar [Storage].
	Avatars() Storage
	// WorkspaceInfo should return the workspace information, if it is available.
	WorkspaceInfo(ctx context.Context) (*slack.AuthTestResponse, error)
}

Sourcer is an interface for retrieving data from different sources. If a method is not supported, it should return ErrNotSupported. If information is missing, e.g. there are no channels or no data for the channel, it should return ErrNotFound.

type Storage

type Storage interface {
	// FS should return the filesystem with file attachments.
	FS() fs.FS
	// Type should return the storage type.
	Type() StorageType
	// File should return the path of the file WITHIN the filesystem returned
	// by FS().  If file is not found, it should return fs.ErrNotExist.
	File(id string, name string) (string, error)
	// FilePath should return the path to the file f relative to the root of
	// the Source (i.e., for Mattermost, __uploads/ID/Name.ext).
	FilePath(ch *slack.Channel, f *slack.File) string
}

Storage is the interface for the file storage used by the source types.

type StorageType

type StorageType uint8

StorageType is the type of storage used for the files within the source.

const (
	// STnone is the storage type for no storage.
	STnone StorageType = iota
	// STstandard is the storage type for the standard file storage.
	STstandard
	// STmattermost is the storage type for Mattermost.
	STmattermost
	// STdump is the storage type for the dump format.
	STdump
	// STAvatar is the storage type for the avatar storage.
	STAvatar
)

func (*StorageType) Func

func (e *StorageType) Func() (pathFn func(*slack.Channel, *slack.File) string, ok bool)

Func returns the "resolve path" function that returns the file path for the given channel and file. It returns false if the storage type is not recognised.

func (*StorageType) Set

func (e *StorageType) Set(v string) error

Set translates the string value into the StorageType and satisfies the flag.Value interface. It is based on the declarations generated by stringer.

It is imperative that the stringer is generated prior to calling this function, if any new storage methods are added.
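Because the pointer type satisfies flag.Value, a storage type can be bound directly to a command-line flag. The following is a rough sketch with a local StorageType. The string names in the lookup table are assumptions made for this example; the real values come from the stringer-generated code and may differ.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// StorageType mirrors the documented declaration and constants.
type StorageType uint8

const (
	STnone StorageType = iota
	STstandard
	STmattermost
	STdump
	STAvatar
)

// names is a hypothetical lookup table standing in for the
// stringer-generated declarations.
var names = map[string]StorageType{
	"none":       STnone,
	"standard":   STstandard,
	"mattermost": STmattermost,
	"dump":       STdump,
	"avatar":     STAvatar,
}

// String returns the name of the storage type.
func (e StorageType) String() string {
	for k, v := range names {
		if v == e {
			return k
		}
	}
	return fmt.Sprintf("StorageType(%d)", uint8(e))
}

// Set parses v (case-insensitively) and stores the result in e.
func (e *StorageType) Set(v string) error {
	st, ok := names[strings.ToLower(v)]
	if !ok {
		return fmt.Errorf("unknown storage type: %q", v)
	}
	*e = st
	return nil
}

// Compile-time check that *StorageType satisfies flag.Value.
var _ flag.Value = (*StorageType)(nil)

func main() {
	fset := flag.NewFlagSet("demo", flag.ContinueOnError)
	st := STnone
	fset.Var(&st, "storage", "file storage type")
	if err := fset.Parse([]string{"-storage", "mattermost"}); err != nil {
		panic(err)
	}
	fmt.Println(st == STmattermost) // prints "true"
}
```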

func (StorageType) String

func (i StorageType) String() string

Directories

Path Synopsis
Package mock_source is a generated GoMock package.