Documentation ¶
Overview ¶
Package source provides archive readers for different output formats.
Currently, the following formats are supported:
- archive
- database
- dump
- Slack Export
Use the Load function to load a source from the file system. It automatically detects the format and returns the appropriate reader.
All sources implement the Sourcer interface, which provides methods to retrieve data from the source. The Resumer interface is implemented by sources that can be resumed. The SourceResumeCloser interface is implemented by sources that can be resumed and closed.
There are two sentinel errors of interest, ErrNotFound and ErrNotSupported. See their documentation for more information.
What is a Source? ¶
A source is a generic interface over a data source. It can be implemented not only for Slack data but for any other messenger, because it represents entities common to every messenger, such as "message", "thread" and "channel"; for Telegram these would be "message", "reply" and "chat" respectively. The caveat is that a non-Slack source's entities need to be converted to Slack entities, which shouldn't be hard unless you aim to replicate formatting.
In this package, for now, the source is implemented only for data originating from Slackdump and Slack.
Loading a Source ¶
If you know which source you are loading, or need a particular source type, you can use a concrete Open* function, such as OpenDatabase. If you don't, use the Load function, which determines the source type and returns the appropriate source.
It is a good idea to defer closing the source, as it may hold a database connection or a file handle. The SourceResumeCloser interface returned by Load implements the io.Closer interface, so you can use it with a defer statement. For example:
src, err := source.Load(ctx, "path/to/source")
if err != nil {
	log.Fatal(err)
}
defer src.Close()
Within your code, you can call the Type method which will return the source type.
Source Types ¶
The source type returned by Type is a bitmask; use the Has method to check whether a particular flag is set. All flag constants start with F and have the Flags type.
The source type returned by the Type method on a particular source will only have the type flag set, without any additional flags (as of v3.1.0; this may change in future versions, so always use the Has method to be on the safe side).
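The following sketch shows why a flag check is safer than an equality comparison. It redeclares the Flags type and its constants locally so that it compiles on its own; the `has` helper is a stand-in for the Has method, whose exact signature is an assumption here, and `describe` is a hypothetical function.

```go
package main

import "fmt"

// Flags mirrors the documented bitmask type. It is redeclared here only so
// that the sketch compiles without importing the package.
type Flags int8

const (
	FDirectory Flags = 1 << iota
	FZip
	FChunk
	FExport
	FDump
	FDatabase
	FUnknown Flags = 0
)

// has is a stand-in for the Has method (its signature is an assumption).
// It reports whether every bit of flag is set in f.
func has(f, flag Flags) bool {
	return flag != 0 && f&flag == flag
}

// describe maps a source type to a short label. It uses flag checks rather
// than equality, so it keeps working if container flags are set as well.
func describe(f Flags) string {
	switch {
	case has(f, FDatabase):
		return "database"
	case has(f, FDump):
		return "dump"
	case has(f, FExport):
		return "export"
	case has(f, FChunk):
		return "chunk"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(FExport | FZip)) // export
	fmt.Println(describe(FUnknown))       // unknown
}
```

An equality test such as `f == FExport` would fail for the combined value `FExport | FZip`, which is the situation the note about future versions warns about.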
Index ¶
- Constants
- Variables
- func AvatarParams(u *slack.User) (userID string, filename string)
- func DumpFilepath(ci *slack.Channel, f *slack.File) string
- func ExportChanName(ch *slack.Channel) string
- func MattermostFilepath(_ *slack.Channel, f *slack.File) string
- func MattermostFilepathWithDir(dir string) func(*slack.Channel, *slack.File) string
- func SanitizeFilename(name string) string
- func StdFilepath(ci *slack.Channel, f *slack.File) string
- type AvatarStorage
- type ChunkDir
- func (c *ChunkDir) AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
- func (c *ChunkDir) AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)
- func (c *ChunkDir) Avatars() Storage
- func (c *ChunkDir) ChannelInfo(_ context.Context, channelID string) (*slack.Channel, error)
- func (c *ChunkDir) Channels(ctx context.Context) ([]slack.Channel, error)
- func (c *ChunkDir) Close() error
- func (c *ChunkDir) Files() Storage
- func (c *ChunkDir) Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)
- func (c *ChunkDir) Name() string
- func (c *ChunkDir) Sorted(ctx context.Context, id string, desc bool, ...) error
- func (c *ChunkDir) ToChunk(ctx context.Context, enc chunk.Encoder, _ int64) error
- func (c *ChunkDir) Type() Flags
- func (c *ChunkDir) Users(context.Context) ([]slack.User, error)
- func (c *ChunkDir) WorkspaceInfo(context.Context) (*slack.AuthTestResponse, error)
- type Database
- type Dump
- func (d Dump) AllMessages(_ context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
- func (d Dump) AllThreadMessages(_ context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)
- func (d Dump) Avatars() Storage
- func (d Dump) ChannelInfo(_ context.Context, channelID string) (*slack.Channel, error)
- func (d Dump) Channels(context.Context) ([]slack.Channel, error)
- func (d Dump) Close() error
- func (d Dump) Files() Storage
- func (d Dump) Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)
- func (d Dump) Name() string
- func (d *Dump) Sorted(ctx context.Context, channelID string, desc bool, ...) error
- func (d Dump) Type() Flags
- func (d Dump) Users(context.Context) ([]slack.User, error)
- func (d Dump) WorkspaceInfo(context.Context) (*slack.AuthTestResponse, error)
- type Export
- func (e *Export) AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
- func (e *Export) AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)
- func (e *Export) Avatars() Storage
- func (e *Export) ChannelInfo(ctx context.Context, channelID string) (*slack.Channel, error)
- func (e *Export) Channels(context.Context) ([]slack.Channel, error)
- func (e *Export) Close() error
- func (e *Export) Files() Storage
- func (e *Export) Latest(ctx context.Context) (map[structures.SlackLink]time.Time, error)
- func (e *Export) Name() string
- func (e *Export) Sorted(ctx context.Context, channelID string, desc bool, ...) error
- func (e *Export) Type() Flags
- func (e *Export) Users(context.Context) ([]slack.User, error)
- func (e *Export) WorkspaceInfo(context.Context) (*slack.AuthTestResponse, error)
- type Flags
- type NoStorage
- type Resumer
- type STDump
- type STMattermost
- type STStandard
- type SourceResumeCloser
- type Sourcer
- type Storage
- type StorageType
Constants ¶
const DefaultDBFile = "slackdump.sqlite"
Variables ¶
var (
	// ErrNotSupported is returned if the method is not supported.
	ErrNotSupported = errors.New("method not supported")
	// ErrNotFound is returned if the data is missing or not found.
	ErrNotFound = errors.New("no data found")
)
var ErrUnknownLinkType = errors.New("unknown link type")
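A caller would typically distinguish these sentinel errors with errors.Is. The sketch below redeclares ErrNotSupported and ErrNotFound locally so that it is self-contained; real code would refer to the package's own variables. The `classify` function and its labels are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the package's sentinel errors, redeclared so that the sketch
// is self-contained.
var (
	ErrNotSupported = errors.New("method not supported")
	ErrNotFound     = errors.New("no data found")
)

// classify turns an error returned by a source method into a decision.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrNotFound):
		return "skip: no data"
	case errors.Is(err, ErrNotSupported):
		return "skip: not supported by this source"
	default:
		return "fail"
	}
}

func main() {
	// Wrapping with %w keeps the sentinel discoverable by errors.Is.
	wrapped := fmt.Errorf("channel C123: %w", ErrNotFound)
	fmt.Println(classify(wrapped))
	fmt.Println(classify(ErrNotSupported))
	fmt.Println(classify(errors.New("disk failure")))
}
```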
Functions ¶
func AvatarParams ¶
AvatarParams is a convenience function that returns the user ID and the base name of the original avatar filename to be passed to AvatarStorage.File function. For example:
var as *AvatarStorage
var u *slack.User
fmt.Println(as.File(AvatarParams(u)))
func DumpFilepath ¶
DumpFilepath returns the path to the file within the channel directory.
func ExportChanName ¶
ExportChanName returns the channel name, or the channel ID if it is a DM.
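As a rough illustration of that rule, the sketch below uses a reduced, hypothetical `channel` struct in place of slack.Channel. Using an `IsIM` field to detect a DM is an assumption; the real function may decide differently.

```go
package main

import "fmt"

// channel is a hypothetical, reduced stand-in for slack.Channel.
type channel struct {
	ID   string
	Name string
	IsIM bool // true for direct messages
}

// exportChanName returns the name for regular channels and falls back to the
// ID for DMs.
func exportChanName(ch channel) string {
	if ch.IsIM {
		return ch.ID
	}
	return ch.Name
}

func main() {
	fmt.Println(exportChanName(channel{ID: "C123", Name: "general"}))
	fmt.Println(exportChanName(channel{ID: "D456", IsIM: true}))
}
```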
func MattermostFilepath ¶
MattermostFilepath returns the path to the file within the __uploads directory.
func MattermostFilepathWithDir ¶
MattermostFilepathWithDir returns the path to the file within the given directory, but it follows the mattermost naming pattern. In most cases you don't need to use this function.
func SanitizeFilename ¶
SanitizeFilename ensures the filename is safe for all OSes, especially Windows.
Types ¶
type AvatarStorage ¶
type AvatarStorage struct {
// contains filtered or unexported fields
}
func NewAvatarStorage ¶
func NewAvatarStorage(fsys fs.FS) (*AvatarStorage, error)
func (*AvatarStorage) FS ¶
func (r *AvatarStorage) FS() fs.FS
func (*AvatarStorage) File ¶
func (r *AvatarStorage) File(userID string, imageOriginalBase string) (string, error)
func (*AvatarStorage) Type ¶
func (r *AvatarStorage) Type() StorageType
type ChunkDir ¶
type ChunkDir struct {
// contains filtered or unexported fields
}
ChunkDir is the chunk directory source.
TODO: create an index of entries, otherwise it does the full scan of the directory.
func OpenChunkDir ¶
OpenChunkDir creates a new ChunkDir source. It expects the attachments to be in the Mattermost storage format. If the attachments are not in the Mattermost storage format, it will assume they were not downloaded.
func (*ChunkDir) AllMessages ¶
func (c *ChunkDir) AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
AllMessages returns all messages for the channel. Current restriction: it expects all messages for the requested file to be in the file ID.json.gz. If messages for the channel are scattered across multiple files, it will not return all of them.
func (*ChunkDir) AllThreadMessages ¶
func (*ChunkDir) ChannelInfo ¶
ChannelInfo accepts the fileID (so it can treat channel or thread exports equally). If in doubt, use channelID as the fileID.
func (*ChunkDir) WorkspaceInfo ¶
type Database ¶
Database represents a database source. It implements the Sourcer interface and provides access to the database data. It also provides access to the files and avatars storage, if available. The database source is created by calling OpenDatabase function.
func DatabaseWithSource ¶
DatabaseWithSource returns a new database source with the given database processor source. It will not have any files or avatars storage. In most cases you should use OpenDatabase instead, unless you know what you are doing.
func OpenDatabase ¶
OpenDatabase attempts to open the database at the given path. It supports two cases: the path points directly to the database file, or the path is a directory containing the "slackdump.sqlite" file. In the latter case, it will also attempt to open the Mattermost storage, and if no storage is found, it will fall back to the special NoStorage type, which returns fs.ErrNotExist for all file operations.
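The two accepted path forms can be pictured with a small helper. This is only a sketch of the idea, not the package's implementation: `resolveDBPath` is a hypothetical function, and DefaultDBFile is redeclared locally to mirror the documented constant.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// DefaultDBFile mirrors the documented constant.
const DefaultDBFile = "slackdump.sqlite"

// resolveDBPath is a hypothetical helper. Given a path that is either the
// database file itself or a directory, it returns the path where the database
// file is expected. For a directory it does not check that the file exists.
func resolveDBPath(path string) (string, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	if fi.IsDir() {
		return filepath.Join(path, DefaultDBFile), nil
	}
	return path, nil
}

func main() {
	// The current directory always exists, so this takes the directory branch.
	p, err := resolveDBPath(".")
	fmt.Println(p, err)
}
```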
func (*Database) WorkspaceInfo ¶
type Dump ¶
type Dump struct {
// contains filtered or unexported fields
}
func OpenDump ¶
OpenDump opens the data in dump format (Slackdump v1.1.0+) from filesystem fsys, and the given name. It will scan for file attachments.
If you need to open a dump from Slackdump pre-v1.1.0, convert it first, with the following command:
slackdump tools convertv1
Note: slackdump pre-v1.1.0 dumps do not have threads.
func (Dump) AllMessages ¶
func (Dump) AllThreadMessages ¶
func (Dump) ChannelInfo ¶
func (Dump) Channels ¶
Channels returns channels for the dump. It first tries to read the channels from the channels.json file. If that fails, it will walk the filesystem loading the channel files and extracting channel names and IDs from them.
func (Dump) Users ¶
Users returns users for the dump. It first tries to read the users from the users.json file. If that fails, there's no other way for it to get users, so it will return an empty slice and a nil error. Dumps may not have user information.
func (Dump) WorkspaceInfo ¶
type Export ¶
type Export struct {
// contains filtered or unexported fields
}
Export implements viewer.Sourcer for the zip file Slack export format.
func OpenExport ¶
OpenExport opens a Slack export with the given name from the filesystem fsys.
func (*Export) AllMessages ¶
func (e *Export) AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
AllMessages returns all channel messages without thread messages.
func (*Export) AllThreadMessages ¶
func (e *Export) AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)
AllThreadMessages returns all thread messages for the channelID:threadID. If the thread is contained in the cache, it will iterate only through the files that contain the thread messages, otherwise it will iterate through all messages in the channel and extract the thread messages. Call [buildThreadCache] for the channelID, before calling this method to speed up search.
func (*Export) ChannelInfo ¶
func (*Export) WorkspaceInfo ¶
type Flags ¶
type Flags int8
const (
	// container
	// FDirectory indicates that the source is a directory.
	FDirectory Flags = 1 << iota
	// FZip indicates that the source is a ZIP archive.
	FZip
	// main content
	// FChunk is set on directories with json.gz files.
	FChunk
	// FExport is set on Slack export directories or ZIP files.
	FExport
	// FDump is set on a Slackdump Dump format directory or ZIP file.
	FDump
	// FDatabase indicates that the source is a SQLite database.
	FDatabase
	// FUnknown is returned if the source type is not supported or
	// can not be determined.
	FUnknown Flags = 0
)
type NoStorage ¶
type NoStorage struct{}
NoStorage is the Storage that returns fs.ErrNotExist for all files.
func (NoStorage) Type ¶
func (NoStorage) Type() StorageType
type STDump ¶
type STDump struct {
// contains filtered or unexported fields
}
STDump is the Storage for the dump format. Files are stored in the directories named after the channel IDs.
Directory structure:
./
+-- <channel_id1>/
|   +-- <file_id1>-filename.ext
|   +-- <file_id2>-otherfile.ext
|   +-- ...
+-- <channel_id1>.json
+-- <channel_id2>/
|   +-- <file_id3>-filename.ext
|   +-- <file_id4>-otherfile.ext
|   +-- ...
+-- <channel_id2>.json
+-- ...
func NewDumpStorage ¶
NewDumpStorage returns the file storage of the Slackdump dump format. fsys is the root of the dump.
func (*STDump) Type ¶
func (r *STDump) Type() StorageType
type STMattermost ¶
type STMattermost struct {
// contains filtered or unexported fields
}
STMattermost is the Storage for the mattermost export format. Files are stored in the __uploads subdirectory, and the Storage is the filesystem of the __uploads directory.
Directory structure:
./__uploads/
+-- <file_id1>/filename.ext
+-- <file_id2>/otherfile.ext
+-- ...
func OpenMattermostStorage ¶
func OpenMattermostStorage(rootfs fs.FS) (*STMattermost, error)
OpenMattermostStorage returns the resolver for the mattermost export format. rootfs is the root filesystem of the export.
func (*STMattermost) FS ¶
func (r *STMattermost) FS() fs.FS
func (*STMattermost) Type ¶
func (r *STMattermost) Type() StorageType
type STStandard ¶
type STStandard struct {
// contains filtered or unexported fields
}
STStandard is the Storage for the standard export format. Files are stored in the "attachments" subdirectories, and the Storage is the filesystem of the export.
Directory structure:
./
+-- <channel_name>/
|   +-- attachments/<file_id1>-filename.ext
|   +-- attachments/<file_id2>-otherfile.ext
|   +-- ...
+-- ...
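The directory layouts documented for STDump, STMattermost and STStandard can be compared side by side with three small path builders. These helpers are hypothetical and take plain strings; the package's own functions (DumpFilepath, MattermostFilepath, StdFilepath) take *slack.Channel and *slack.File and may also sanitise file names.

```go
package main

import (
	"fmt"
	"path"
)

// fs.FS paths always use forward slashes, hence package path rather than
// path/filepath.

// dumpPath follows the dump layout: <channel_id>/<file_id>-<name>.
func dumpPath(channelID, fileID, name string) string {
	return path.Join(channelID, fileID+"-"+name)
}

// mattermostPath follows the Mattermost layout: __uploads/<file_id>/<name>.
func mattermostPath(fileID, name string) string {
	return path.Join("__uploads", fileID, name)
}

// standardPath follows the standard export layout:
// <channel_name>/attachments/<file_id>-<name>.
func standardPath(channelName, fileID, name string) string {
	return path.Join(channelName, "attachments", fileID+"-"+name)
}

func main() {
	fmt.Println(dumpPath("C123", "F456", "report.pdf"))
	fmt.Println(mattermostPath("F456", "report.pdf"))
	fmt.Println(standardPath("general", "F456", "report.pdf"))
}
```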
func OpenStandardStorage ¶
func OpenStandardStorage(rootfs fs.FS) (*STStandard, error)
OpenStandardStorage returns the resolver for the export's standard storage format.
func (*STStandard) FS ¶
func (r *STStandard) FS() fs.FS
func (*STStandard) Type ¶
func (r *STStandard) Type() StorageType
type SourceResumeCloser ¶
SourceResumeCloser is the interface that should be implemented by sources that can be resumed and closed.
func Load ¶
func Load(ctx context.Context, src string) (SourceResumeCloser, error)
Load loads the source from file src. It will automatically detect the format and return the appropriate reader. It will detect any attachments and avatars if they exist in the source. It will return an error if the source type is not supported or if the source is not found.
type Sourcer ¶
type Sourcer interface {
// Name should return the name of the retriever's underlying media, e.g.
// a directory or an archive.
Name() string
// Type should return the flag that describes the type of the source.
// It may not have all the flags set, but it should have at least one
// identifying the source type.
Type() Flags
// Channels should return all channels.
Channels(ctx context.Context) ([]slack.Channel, error)
// Users should return all users.
Users(ctx context.Context) ([]slack.User, error)
// AllMessages should return all messages for the given channel id. If there are
// no messages for the channel, it should return ErrNotFound.
AllMessages(ctx context.Context, channelID string) (iter.Seq2[slack.Message, error], error)
// AllThreadMessages should return all messages for the given tuple
// (channelID, threadID). It should return the parent channel message
// (thread lead) as a first message. If there are no messages for the
// thread, it should return ErrNotFound.
AllThreadMessages(ctx context.Context, channelID, threadID string) (iter.Seq2[slack.Message, error], error)
// Sorted should iterate over all (both channel and thread) messages for
// the requested channel id. If desc is true, it must return messages in
// descending order (by timestamp), otherwise in ascending order. The
// callback function cb should be called for each message. If cb returns an
// error, the iteration should be stopped and the error should be returned.
Sorted(ctx context.Context, channelID string, desc bool, cb func(ts time.Time, msg *slack.Message) error) error
// ChannelInfo should return the channel information for the given channel
// id.
ChannelInfo(ctx context.Context, channelID string) (*slack.Channel, error)
// Files should return file [Storage].
Files() Storage
// Avatars should return the avatar [Storage].
Avatars() Storage
// WorkspaceInfo should return the workspace information, if it is available.
WorkspaceInfo(ctx context.Context) (*slack.AuthTestResponse, error)
}
Sourcer is an interface for retrieving data from different sources. If any of the methods is not supported, it should return ErrNotSupported. If any information is missing, i.e. no channels, or no data for the channel, it should return ErrNotFound.
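The Sorted contract (ordering by timestamp, and stopping as soon as the callback returns an error) can be illustrated over an in-memory slice. Everything in this sketch is hypothetical: `message` stands in for slack.Message, `sorted` imitates the method, and `errStop` is a private sentinel used to end the iteration early.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
	"time"
)

// message is a hypothetical stand-in for slack.Message.
type message struct {
	TS   time.Time
	Text string
}

// errStop is a private sentinel the callback returns to end the iteration.
var errStop = errors.New("stop iteration")

// sorted imitates the documented contract: ascending by timestamp unless desc
// is true, stopping at the first callback error and returning it.
func sorted(msgs []message, desc bool, cb func(ts time.Time, m *message) error) error {
	ordered := slices.Clone(msgs)
	slices.SortFunc(ordered, func(a, b message) int {
		if desc {
			return b.TS.Compare(a.TS)
		}
		return a.TS.Compare(b.TS)
	})
	for i := range ordered {
		if err := cb(ordered[i].TS, &ordered[i]); err != nil {
			return err
		}
	}
	return nil
}

// firstN collects the text of the first n messages in the requested order.
func firstN(msgs []message, desc bool, n int) ([]string, error) {
	var out []string
	err := sorted(msgs, desc, func(_ time.Time, m *message) error {
		if len(out) == n {
			return errStop
		}
		out = append(out, m.Text)
		return nil
	})
	if errors.Is(err, errStop) {
		err = nil // early exit requested by the callback, not a failure
	}
	return out, err
}

// sample returns three messages in deliberately shuffled order.
func sample() []message {
	base := time.Unix(1700000000, 0)
	return []message{
		{TS: base.Add(2 * time.Second), Text: "third"},
		{TS: base, Text: "first"},
		{TS: base.Add(time.Second), Text: "second"},
	}
}

func main() {
	fmt.Println(firstN(sample(), false, 2))
	fmt.Println(firstN(sample(), true, 1))
}
```

Returning a sentinel from the callback and filtering it afterwards is one common way to stop early without treating the stop as a failure.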
type Storage ¶
type Storage interface {
// FS should return the filesystem with file attachments.
FS() fs.FS
// Type should return the storage type.
Type() StorageType
// File should return the path of the file WITHIN the filesystem returned
// by FS(). If file is not found, it should return fs.ErrNotExist.
File(id string, name string) (string, error)
// FilePath should return the path to the file f relative to the root of
// the Source (i.e., for Mattermost, __uploads/ID/Name.ext).
FilePath(ch *slack.Channel, f *slack.File) string
}
Storage is the interface for the file storage used by the source types.
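The intended lookup flow is to resolve a path with File and then open it through the filesystem returned by FS. The sketch below demonstrates that flow with an in-memory filesystem. `fileStorage` is a reduced, hypothetical version of the interface (Type and FilePath are left out to avoid depending on the slack package), and `memStorage` is an illustrative implementation, not one provided by the package.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// fileStorage is a reduced, hypothetical version of the Storage interface.
type fileStorage interface {
	FS() fs.FS
	File(id, name string) (string, error)
}

// memStorage keeps files in memory under <file_id>/<name>.
type memStorage struct{ fsys fstest.MapFS }

func (m memStorage) FS() fs.FS { return m.fsys }

// File returns the path within FS(), or fs.ErrNotExist if the file is absent.
func (m memStorage) File(id, name string) (string, error) {
	p := path.Join(id, name)
	if _, err := fs.Stat(m.fsys, p); err != nil {
		return "", fs.ErrNotExist
	}
	return p, nil
}

// readAttachment resolves the path first and then reads through FS().
func readAttachment(st fileStorage, id, name string) ([]byte, error) {
	p, err := st.File(id, name)
	if err != nil {
		return nil, err
	}
	return fs.ReadFile(st.FS(), p)
}

// demoStorage returns a storage holding a single file.
func demoStorage() memStorage {
	return memStorage{fsys: fstest.MapFS{
		"F456/report.txt": {Data: []byte("quarterly numbers")},
	}}
}

// isNotExist reports whether err indicates a missing file.
func isNotExist(err error) bool { return errors.Is(err, fs.ErrNotExist) }

func main() {
	st := demoStorage()
	data, err := readAttachment(st, "F456", "report.txt")
	fmt.Println(string(data), err)

	_, err = readAttachment(st, "F999", "missing.txt")
	fmt.Println(isNotExist(err))
}
```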
type StorageType ¶
type StorageType uint8
StorageType is the type of storage used for the files within the source.
const (
	// STnone is the storage type for no storage.
	STnone StorageType = iota
	// STstandard is the storage type for the standard file storage.
	STstandard
	// STmattermost is the storage type for Mattermost.
	STmattermost
	// STdump is the storage type for the dump format.
	STdump
	// STAvatar is the storage type for the avatar storage.
	STAvatar
)
func (*StorageType) Func ¶
Func returns the "resolve path" function that returns the file path for the given channel and file. It returns false if the storage type is not recognised.
func (*StorageType) Set ¶
func (e *StorageType) Set(v string) error
Set translates the string value into the StorageType and satisfies the flag.Value interface. It is based on the declarations generated by stringer.
It is imperative that the stringer code is regenerated before this function is used whenever new storage types are added.
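For readers unfamiliar with flag.Value, the sketch below shows the general pattern with a hypothetical `storageKind` type. The string names in `kindNames` are illustrative and play the role of the stringer-generated table; they are not the values the package actually accepts.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// storageKind is a hypothetical stand-in for StorageType.
type storageKind uint8

const (
	kindNone storageKind = iota
	kindStandard
	kindMattermost
	kindDump
)

// kindNames plays the role of the stringer-generated table. It must be kept
// in sync with the constants above, which is the point of the note on
// regenerating stringer output.
var kindNames = []string{"none", "standard", "mattermost", "dump"}

// String returns the name of the kind, or a numeric form if it is unknown.
func (k storageKind) String() string {
	if int(k) < len(kindNames) {
		return kindNames[k]
	}
	return fmt.Sprintf("storageKind(%d)", uint8(k))
}

// Set implements flag.Value by matching v against the known names.
func (k *storageKind) Set(v string) error {
	for i, name := range kindNames {
		if v == name {
			*k = storageKind(i)
			return nil
		}
	}
	return fmt.Errorf("unknown storage kind: %q", v)
}

// parseKind parses args with a -files flag backed by storageKind.
func parseKind(args []string) (storageKind, error) {
	kind := kindStandard // default when the flag is absent
	fset := flag.NewFlagSet("demo", flag.ContinueOnError)
	fset.SetOutput(io.Discard) // keep usage text out of the demo output
	fset.Var(&kind, "files", "attachment storage layout")
	err := fset.Parse(args)
	return kind, err
}

func main() {
	fmt.Println(parseKind([]string{"-files", "mattermost"}))
	fmt.Println(parseKind(nil))
}
```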
func (StorageType) String ¶
func (i StorageType) String() string
Source Files ¶
Directories ¶
| Path | Synopsis |
|---|---|
| mock_source | Package mock_source is a generated GoMock package. |