Dataset

This class loads empirical datasets. Datasets are stored as matrices and can include functional (e.g. fMRI) and structural (e.g. DTI) data.

Format

Empirical datasets are stored in the neurolib/data/datasets directory. In each dataset, subject-wise functional and structural data are stored as MATLAB .mat matrices that can be opened in Python using SciPy's loadmat function. Structural data are \(N \times N\) matrices, and functional time series are \(N \times t\) matrices, \(N\) being the number of brain regions and \(t\) the number of time steps. Example datasets are included in neurolib, and custom datasets can be added by placing them in the dataset directory.
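
For example, a stored matrix can be inspected directly with SciPy. Below is a minimal, self-contained sketch; the file name is made up for this example, and the key "sc" follows the convention of the bundled structural connectivity files:

import numpy as np
import scipy.io

# Write and re-read a small example matrix; real dataset files follow
# naming patterns like DTI_CM*.mat and store data under keys such as "sc".
scipy.io.savemat("DTI_CM_example.mat", {"sc": np.random.rand(4, 4)})
mat = scipy.io.loadmat("DTI_CM_example.mat")
print(mat["sc"].shape)  # structural data are (N, N)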

Structural DTI data

To simulate a whole-brain network model, we first need to load the structural connectivity matrices from a DTI dataset. The matrices are usually a result of processing DTI data and performing fiber tractography using software like FSL or DSIStudio. The handling of the datasets is done by the Dataset class, and the attributes in the following refer to its instances. Upon initialization, the subject-wise dataset is loaded from disk. For all examples in this paper, we use freely available data from the ConnectomeDB of the Human Connectome Project (HCP). For a given parcellation of the brain into \(N\) brain regions, these matrices are the \(N \times N\) adjacency matrix self.Cmat, i.e. the structural connectivity matrix, which determines the coupling strengths between brain areas, and the fiber length matrix Dmat, which determines the signal transmission delays. The two example datasets currently included in neurolib use the 80 cortical regions of the AAL2 atlas to define the brain areas and are sorted in an LRLR ordering.
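
A minimal sketch of loading one of the included example datasets (assuming it is named "hcp", as in the neurolib examples):

from neurolib.utils.loadData import Dataset

ds = Dataset("hcp")   # example dataset shipped with neurolib
print(ds.Cmat.shape)  # (80, 80): structural connectivity for the 80 AAL2 cortical regions
print(ds.Dmat.shape)  # (80, 80): fiber length matrix used for transmission delays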

Connectivity matrix normalization

The elements of the structural connectivity matrix Cmat are typically the number of reconstructed fibers from DTI tractography. Since the number of fibers depends on the method and the parameters of the (probabilistic or deterministic) tractography, they need to be normalized using one of the three implemented methods. The first method, max, simply divides the entries of Cmat by the largest entry, such that the largest entry becomes 1. The second method, waytotal, divides the entries of each column of Cmat by the number of fiber tracts generated from the respective brain region during probabilistic tractography in FSL, which is contained in the waytotal.txt file. The third method, nvoxel, divides the entries of each column of Cmat by the size, i.e., the number of voxels, of the corresponding brain area. The last two methods yield an asymmetric connectivity matrix, while the first one keeps Cmat symmetric. All normalization steps are done on the subject-wise matrices Cmats and Dmats. In a final step, all matrices can also be averaged across all subjects to yield one Cmat and Dmat per dataset.
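
A minimal sketch of the max and waytotal rules described above (not the library's internal implementation, which additionally handles the nvoxel case):

import numpy as np

def normalize_max(cmat):
    # Divide by the largest entry so that the maximum becomes 1; Cmat stays symmetric.
    return cmat / np.max(cmat)

def normalize_waytotal(cmat, waytotal):
    # waytotal holds one value per region; broadcasting divides column j by waytotal[j],
    # which makes the resulting matrix asymmetric.
    return cmat / waytotal

cmat = np.random.rand(80, 80)
waytotal = np.full(80, 5000.0)
print(normalize_max(cmat).max())  # 1.0
print(normalize_waytotal(cmat, waytotal).shape)  # (80, 80)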

Functional MRI data

Subject-wise fMRI time series must be in a \((N \times t)\)-dimensional format, where \(N\) is the number of brain regions and \(t\) the length of the time series. Each region-wise time series represents the BOLD activity averaged across all voxels of that region, which can also be obtained from software like FSL. Functional connectivity (FC) captures the spatial correlation structure of the BOLD time series averaged across the entire time of the recording. FC matrices are accessible via the attribute FCs and are generated by computing the Pearson correlation of the time series between all regions, yielding an \(N \times N\) matrix for each subject.
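
Since FC is the region-by-region Pearson correlation, it can be sketched with NumPy (neurolib computes it via its func.fc utility; the random data below is purely illustrative):

import numpy as np

def fc(ts):
    # np.corrcoef treats each row as one variable, so an (N x t) array
    # yields the (N x N) Pearson correlation between regions.
    return np.corrcoef(ts)

bold = np.random.rand(80, 500)  # synthetic (N x t) BOLD series
print(fc(bold).shape)  # (80, 80)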

To capture the temporal fluctuations of time-dependent FC(t), which are lost when averaging across the entire recording time, functional connectivity dynamics matrices (FCDs) are computed as the element-wise Pearson correlation of time-dependent FC(t) matrices in a moving window of a chosen length (for example, 1 min) across the BOLD time series. This yields a \(t_{FCD} \times t_{FCD}\) FCD matrix for each subject, with \(t_{FCD}\) being the number of steps the window was moved.
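
A sketch of this sliding-window construction, with window and step sizes given in samples (neurolib computes FCDs via its func.fcd utility, which takes a stepsize argument as seen in computeFCD below; correlating the upper triangles of the windowed FC matrices is an assumption of this sketch):

import numpy as np

def fcd(ts, windowsize=60, stepsize=10):
    # Compute FC(t) in sliding windows, then correlate the windows with each other.
    n_windows = (ts.shape[1] - windowsize) // stepsize + 1
    iu = np.triu_indices(ts.shape[0], k=1)  # upper triangle, without the diagonal
    window_fcs = np.array(
        [np.corrcoef(ts[:, i * stepsize : i * stepsize + windowsize])[iu] for i in range(n_windows)]
    )
    return np.corrcoef(window_fcs)  # (t_FCD x t_FCD) window-to-window correlations

bold = np.random.rand(80, 500)  # synthetic data for illustration
print(fcd(bold).shape)  # (45, 45) for 500 samples, window 60, step 10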

Source code in neurolib/utils/loadData.py
class Dataset:
    """
    This class loads empirical datasets. Datasets are stored as matrices and can
    include functional (e.g. fMRI) and structural (e.g. DTI) data.

    ## Format

    Empirical datasets are
    stored in the `neurolib/data/datasets` directory. In each dataset, subject-wise functional
    and structural data are stored as MATLAB `.mat` matrices that can be opened in
    Python using SciPy's `loadmat` function. Structural data are
    $N \\times N$ matrices, and functional time series are $N \\times t$ matrices, $N$ being the number
    of brain regions and $t$ the number of time steps. Example datasets are included in `neurolib`,
    and custom datasets can be added by placing them in the dataset directory.

    ## Structural DTI data

    To simulate a whole-brain network model, we first need to load the structural connectivity
    matrices from a DTI dataset. The matrices are usually a result of processing DTI data and
    performing fiber tractography using software like *FSL* or
    *DSIStudio*. The handling of the datasets is done by the
    `Dataset` class, and the attributes in the following refer to its instances.
    Upon initialization, the subject-wise dataset is loaded from disk. For all examples
    in this paper, we use freely available data from the ConnectomeDB of the
    Human Connectome Project (HCP). For a given parcellation of the brain
    into $N$ brain regions, these matrices are the $N \\times N$ adjacency matrix `self.Cmat`,
    i.e. the structural connectivity matrix, which determines the coupling strengths between
    brain areas, and the fiber length matrix `Dmat`, which determines the signal
    transmission delays. The two example datasets currently included in `neurolib` use the 80
    cortical regions of the AAL2 atlas to define the brain areas and are
    sorted in an LRLR ordering.


    ## Connectivity matrix normalization

    The elements of the structural connectivity matrix `Cmat` are typically the number
    of reconstructed fibers from DTI tractography. Since the number of fibers depends on the
    method and the parameters of the (probabilistic or deterministic) tractography, they need to
    be normalized using one of the three implemented methods. The first method, `max`,
    simply divides the entries of `Cmat` by the largest entry, such that the largest
    entry becomes 1. The second method, `waytotal`, divides the entries of each column of
    `Cmat` by the number of fiber tracts generated from the respective brain region during
    probabilistic tractography in FSL, which is contained in the `waytotal.txt` file.
    The third method, `nvoxel`, divides the entries of each column of `Cmat` by the
    size, i.e., the number of voxels, of the corresponding brain area. The last two methods yield
    an asymmetric connectivity matrix, while the first one keeps `Cmat` symmetric.
    All normalization steps are done on the subject-wise matrices `Cmats` and
    `Dmats`. In a final step, all matrices can also be averaged across all subjects
    to yield one `Cmat` and `Dmat` per dataset.

    ## Functional MRI data

    Subject-wise fMRI time series must be in a $(N \\times t)$-dimensional format, where $N$ is the
    number of brain regions and $t$ the length of the time series. Each region-wise time series
    represents the BOLD activity averaged across all voxels of that region, which can also be obtained
    from software like FSL. Functional connectivity (FC) captures the spatial correlation structure
    of the BOLD time series averaged across the entire time of the recording. FC matrices are
    accessible via the attribute `FCs` and are generated by computing the Pearson correlation
    of the time series between all regions, yielding an $N \\times N$ matrix for each subject.

    To capture the temporal fluctuations of time-dependent FC(t), which are lost when averaging
    across the entire recording time, functional connectivity dynamics matrices (`FCDs`) are
    computed as the element-wise Pearson correlation of time-dependent FC(t) matrices in a moving
    window of a chosen length (for example, 1 min) across the BOLD time series. This
    yields a $t_{FCD} \\times t_{FCD}$ FCD matrix for each subject, with $t_{FCD}$ being the number
    of steps the window was moved.

    """

    def __init__(self, datasetName=None, normalizeCmats="max", fcd=False, subcortical=False):
        """
        Load the empirical datasets that are provided with `neurolib`.

        Right now, datasets work on a per-subject basis. A dataset must be located
        in the `neurolib/data/datasets/` directory. Each subject's dataset
        must be in the `subjects` subdirectory of that folder. In each subject
        folder there is a directory called `functional` for time series data
        and a directory called `structural` for the structural connectivity data.

        See `Dataset._loadSubjectFiles()` for more details on which files are
        being loaded.

        The structural connectivity data (accessible via the attribute
        `Dataset.Cmat`) can be normalized using the `normalizeCmats` flag.
        This defaults to "max", which normalizes the Cmat by its maximum.
        Other options are `waytotal` or `nvoxel`, which normalize the
        Cmat by dividing every column of the matrix by the waytotal or
        nvoxel values that are provided in the datasets.

        Info: the waytotal.txt and the nvoxel.txt are files extracted from
        the tractography of DTI data using `probtrackX` from the `fsl` pipeline.

        Individual subject data is provided with the class attributes:
        self.Cmats: Structural connectivity matrix of each subject
        self.Dmats: Fiber length matrix of each subject
        self.BOLDs: BOLD timeseries of each subject
        self.FCs: Functional connectivity matrix of each subject's BOLD timeseries

        Mean data is provided with the class attributes:
        self.Cmat: Structural connectivity matrix, averaged over subjects (coupling strengths between areas)
        self.Dmat: Fiber length matrix, averaged over subjects (signal transmission delays)

        :param datasetName: Name of the dataset to load
        :type datasetName: str
        :param normalizeCmats: Normalization method for the structural connectivity matrix. normalizationMethods = ["max", "waytotal", "nvoxel"]
        :type normalizeCmats: str
        :param fcd: Compute FCD matrices of BOLD data, defaults to False
        :type fcd: bool
        :param subcortical: Include subcortical areas from the atlas or not, defaults to False
        :type subcortical: bool

        """
        self.has_subjects = None
        if datasetName:
            self.loadDataset(datasetName, normalizeCmats=normalizeCmats, fcd=fcd, subcortical=subcortical)

    def loadDataset(self, datasetName, normalizeCmats="max", fcd=False, subcortical=False):
        """Load data into accessible class attributes.

        :param datasetName: Name of the dataset (must be in `datasets` directory)
        :type datasetName: str
        :param normalizeCmats: Normalization method for Cmats, defaults to "max"
        :type normalizeCmats: str, optional
        :param fcd: Compute FCD matrices of BOLD data, defaults to False
        :type fcd: bool, optional
        :param subcortical: Include subcortical areas from the atlas, defaults to False
        :type subcortical: bool, optional
        :raises NotImplementedError: If an unknown normalization method is used
        """
        # the base directory of the dataset
        dsBaseDirectory = os.path.join(os.path.dirname(__file__), "..", "data", "datasets", datasetName)
        assert os.path.exists(dsBaseDirectory), f"Dataset {datasetName} not found in {dsBaseDirectory}."
        self.dsBaseDirectory = dsBaseDirectory
        self.data = dotdict({})

        # load all available subject data from disk to memory
        logging.info(f"Loading dataset {datasetName} from {self.dsBaseDirectory}.")
        self._loadSubjectFiles(self.dsBaseDirectory, subcortical=subcortical)
        assert len(self.data) > 0, "No data loaded."
        assert self.has_subjects

        self.Cmats = self._normalizeCmats(self.getDataPerSubject("cm"), method=normalizeCmats)
        self.Dmats = self.getDataPerSubject("len")

        # take the average of all
        self.Cmat = np.mean(self.Cmats, axis=0)

        self.Dmat = self.getDataPerSubject(
            "len",
            apply="all",
            apply_function=np.mean,
            apply_function_kwargs={"axis": 0},
        )
        self.BOLDs = self.getDataPerSubject("bold")
        self.FCs = self.getDataPerSubject("bold", apply_function=func.fc)

        if fcd:
            self.computeFCD()

        logging.info(f"Dataset {datasetName} loaded.")

    def computeFCD(self):
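        """Compute FCD matrices of the BOLD data of all subjects."""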
        logging.info("Computing FCD matrices ...")
        self.FCDs = self.getDataPerSubject("bold", apply_function=func.fcd, apply_function_kwargs={"stepsize": 10})

    def getDataPerSubject(
        self,
        name,
        apply="single",
        apply_function=None,
        apply_function_kwargs={},
        normalizeCmats="max",
    ):
        """Load data of a certain kind for all users of the current dataset

        :param name: Name of data type, i.e. "bold" or "cm"
        :type name: str
        :param apply: Apply function per subject ("single") or on all subjects ("all"), defaults to "single"
        :type apply: str, optional
        :param apply_function: Apply function on data, defaults to None
        :type apply_function: function, optional
        :param apply_function_kwargs: Keyword arguments of fuction, defaults to {}
        :type apply_function_kwargs: dict, optional
        :return: Subjectwise data, after function apply
        :rtype: list[np.ndarray]
        """
        values = []
        for subject, value in self.data["subjects"].items():
            assert name in value, f"Data type {name} not found in dataset of subject {subject}."
            val = value[name]
            if apply_function and apply == "single":
                val = apply_function(val, **apply_function_kwargs)
            values.append(val)

        if apply_function and apply == "all":
            values = apply_function(values, **apply_function_kwargs)
        return values

    def _normalizeCmats(self, Cmats, method="max", FSL_SAMPLES_PER_VOXEL=5000):
        # normalize per subject data
        normalizationMethods = [None, "max", "waytotal", "nvoxel"]
        if method not in normalizationMethods:
            raise NotImplementedError(
                f'"{method}" is not a known normalization method. Use one of these: {normalizationMethods}'
            )
        if method == "max":
            Cmats = [cm / np.max(cm) for cm in Cmats]
        elif method == "waytotal":
            self.waytotal = self.getDataPerSubject("waytotal")
            Cmats = [cm / wt for cm, wt in zip(Cmats, self.waytotal)]
        elif method == "nvoxel":
            self.nvoxel = self.getDataPerSubject("nvoxel")
            Cmats = [cm / (nv[:, 0] * FSL_SAMPLES_PER_VOXEL) for cm, nv in zip(Cmats, self.nvoxel)]
        return Cmats

    def _loadSubjectFiles(self, dsBaseDirectory, subcortical=False):
        """Dirty subject-wise file loader. Depends on the exact naming of all
        files as provided in the `neurolib/data/datasets` directory. Uses `glob.glob()`
        to find all files based on hardcoded file name matching.

        Can filter out subcortical regions from the AAL2 atlas.

        Info: Dirty implementation that assumes a lot of things about the dataset and filenames.

        :param dsBaseDirectory: Base directory of the dataset
        :type dsBaseDirectory: str
        :param subcortical: Filter subcortical regions from files defined by the AAL2 atlas, defaults to False
        :type subcortical: bool, optional
        """
        # check if there are subject files in the dataset
        if os.path.exists(os.path.join(dsBaseDirectory, "subjects")):
            self.has_subjects = True
            self.data["subjects"] = {}

            # data type paths, glob strings, dirty
            BOLD_paths_glob = os.path.join(dsBaseDirectory, "subjects", "*", "functional", "*rsfMRI*.mat")
            CM_paths_glob = os.path.join(dsBaseDirectory, "subjects", "*", "structural", "DTI_CM*.mat")
            LEN_paths_glob = os.path.join(dsBaseDirectory, "subjects", "*", "structural", "DTI_LEN*.mat")
            WAY_paths_glob = os.path.join(dsBaseDirectory, "subjects", "*", "structural", "waytotal*.txt")
            NVOXEL_paths_glob = os.path.join(dsBaseDirectory, "subjects", "*", "structural", "nvoxel*.txt")

            _ftypes = {
                "bold": BOLD_paths_glob,
                "cm": CM_paths_glob,
                "len": LEN_paths_glob,
                "waytotal": WAY_paths_glob,
                "nvoxel": NVOXEL_paths_glob,
            }

            for _name, _glob in _ftypes.items():
                fnames = glob.glob(_glob)
                # if there is none of this data type
                if len(fnames) == 0:
                    continue
                for f in fnames:
                    # dirty
                    subject = f.split(os.path.sep)[-3]
                    # create subject in dict if not present yet
                    if subject not in self.data["subjects"]:
                        self.data["subjects"][subject] = {}

                    # if the data for this type is not already loaded
                    if _name not in self.data["subjects"][subject]:
                        # bold, cm and len matrices are provided as .mat files
                        if _name in ["bold", "cm", "len"]:
                            filter_subcortical_axis = "both"
                            if _name == "bold":
                                key = "tc"
                                filter_subcortical_axis = 0
                            elif _name == "cm":
                                key = "sc"
                            elif _name == "len":
                                key = "len"
                            # load the data
                            data = self.loadMatrix(f, key=key)
                            if not subcortical:
                                data = filterSubcortical(data, axis=filter_subcortical_axis)
                            self.data["subjects"][subject][_name] = data
                        # waytotal and nvoxel files are .txt files
                        elif _name in ["waytotal", "nvoxel"]:
                            data = np.loadtxt(f)
                            if not subcortical:
                                data = filterSubcortical(data, axis=0)
                            self.data["subjects"][subject][_name] = data

    def loadMatrix(self, matFileName, key="", verbose=False):
        """Function to furiously load .mat files with scipy.io.loadmat.
        Info: More formats are supported but commented out in the code.

        :param matFileName: Filename of matrix to load
        :type matFileName: str
        :param key: .mat file key in which data is stored (example: "sc")
        :type key: str
        :param verbose: Print information while loading, defaults to False
        :type verbose: bool, optional

        :return: Loaded matrix
        :rtype: numpy.ndarray
        """
        if verbose:
            print(f"Loading {matFileName}")
        matrix = scipy.io.loadmat(matFileName)
        if verbose:
            print("\tLoading using scipy.io.loadmat...")
            print(f"Keys: {list(matrix.keys())}")
        if key != "" and key in list(matrix.keys()):
            matrix = matrix[key]
            if verbose:
                print(f'\tLoaded key "{key}"')
        elif isinstance(matrix, dict):
            raise ValueError(f"Object is still a dict. Here are the keys: {matrix.keys()}")
        return matrix

__init__(datasetName=None, normalizeCmats='max', fcd=False, subcortical=False)

Load the empirical datasets that are provided with neurolib.

Right now, datasets work on a per-subject basis. A dataset must be located in the neurolib/data/datasets/ directory. Each subject's dataset must be in the subjects subdirectory of that folder. In each subject folder there is a directory called functional for time series data and a directory called structural for the structural connectivity data.
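
For orientation, a dataset following this convention could be laid out as sketched below; the dataset and subject names are hypothetical, and the file name patterns follow the globs used by Dataset._loadSubjectFiles():

neurolib/data/datasets/myDataset/
    subjects/
        subject_01/
            functional/
                rsfMRI.mat      (BOLD time series, stored under the key "tc")
            structural/
                DTI_CM.mat      (structural connectivity, key "sc")
                DTI_LEN.mat     (fiber lengths, key "len")
                waytotal.txt
                nvoxel.txt
        subject_02/
            ...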

See Dataset._loadSubjectFiles() for more details on which files are being loaded.

The structural connectivity data (accessible via the attribute Dataset.Cmat) can be normalized using the normalizeCmats flag. This defaults to "max", which normalizes the Cmat by its maximum. Other options are waytotal or nvoxel, which normalize the Cmat by dividing every column of the matrix by the waytotal or nvoxel values that are provided in the datasets.

Info: the waytotal.txt and the nvoxel.txt are files extracted from the tractography of DTI data using probtrackX from the fsl pipeline.

Individual subject data is provided with the class attributes:

    self.Cmats: Structural connectivity matrix of each subject
    self.Dmats: Fiber length matrix of each subject
    self.BOLDs: BOLD timeseries of each subject
    self.FCs: Functional connectivity matrix of each subject's BOLD timeseries

Mean data is provided with the class attributes:

    self.Cmat: Structural connectivity matrix, averaged over subjects (coupling strengths between areas)
    self.Dmat: Fiber length matrix, averaged over subjects (signal transmission delays)

Parameters:

    datasetName (str, default None): Name of the dataset to load
    normalizeCmats (str, default 'max'): Normalization method for the structural connectivity matrix; one of ["max", "waytotal", "nvoxel"]
    fcd (bool, default False): Compute FCD matrices of BOLD data
    subcortical (bool, default False): Include subcortical areas from the atlas or not
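
A short hypothetical call combining these options (assuming the bundled example dataset is named "hcp"):

from neurolib.utils.loadData import Dataset

# fcd=True additionally computes per-subject FCD matrices during loading.
ds = Dataset("hcp", normalizeCmats="max", fcd=True, subcortical=False)
print(len(ds.Cmats))     # one normalized Cmat per subject
print(ds.FCDs[0].shape)  # FCD matrix of the first subject
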
Source code in neurolib/utils/loadData.py
def __init__(self, datasetName=None, normalizeCmats="max", fcd=False, subcortical=False):
    """
    Load the empirical datasets that are provided with `neurolib`.

    Right now, datasets work on a per-subject basis. A dataset must be located
    in the `neurolib/data/datasets/` directory. Each subject's dataset
    must be in the `subjects` subdirectory of that folder. In each subject
    folder there is a directory called `functional` for time series data
    and a directory called `structural` for the structural connectivity data.

    See `Dataset._loadSubjectFiles()` for more details on which files are
    being loaded.

    The structural connectivity data (accessible via the attribute
    `Dataset.Cmat`) can be normalized using the `normalizeCmats` flag.
    This defaults to "max", which normalizes the Cmat by its maximum.
    Other options are `waytotal` or `nvoxel`, which normalize the
    Cmat by dividing every column of the matrix by the waytotal or
    nvoxel values that are provided in the datasets.

    Info: the waytotal.txt and the nvoxel.txt are files extracted from
    the tractography of DTI data using `probtrackX` from the `fsl` pipeline.

    Individual subject data is provided with the class attributes:
    self.Cmats: Structural connectivity matrix of each subject
    self.Dmats: Fiber length matrix of each subject
    self.BOLDs: BOLD timeseries of each subject
    self.FCs: Functional connectivity matrix of each subject's BOLD timeseries

    Mean data is provided with the class attributes:
    self.Cmat: Structural connectivity matrix, averaged over subjects (coupling strengths between areas)
    self.Dmat: Fiber length matrix, averaged over subjects (signal transmission delays)

    :param datasetName: Name of the dataset to load
    :type datasetName: str
    :param normalizeCmats: Normalization method for the structural connectivity matrix. normalizationMethods = ["max", "waytotal", "nvoxel"]
    :type normalizeCmats: str
    :param fcd: Compute FCD matrices of BOLD data, defaults to False
    :type fcd: bool
    :param subcortical: Include subcortical areas from the atlas or not, defaults to False
    :type subcortical: bool

    """
    self.has_subjects = None
    if datasetName:
        self.loadDataset(datasetName, normalizeCmats=normalizeCmats, fcd=fcd, subcortical=subcortical)

getDataPerSubject(name, apply='single', apply_function=None, apply_function_kwargs={}, normalizeCmats='max')

Load data of a certain kind for all subjects of the current dataset

Parameters:

    name (str, required): Name of data type, e.g. "bold" or "cm"
    apply (str, default 'single'): Apply function per subject ("single") or on all subjects ("all")
    apply_function (function, default None): Function to apply to the data
    apply_function_kwargs (dict, default {}): Keyword arguments of the function

Returns:

    list[np.ndarray]: Subject-wise data, after the apply function has been applied

Source code in neurolib/utils/loadData.py
def getDataPerSubject(
    self,
    name,
    apply="single",
    apply_function=None,
    apply_function_kwargs={},
    normalizeCmats="max",
):
    """Load data of a certain kind for all users of the current dataset

    :param name: Name of data type, i.e. "bold" or "cm"
    :type name: str
    :param apply: Apply function per subject ("single") or on all subjects ("all"), defaults to "single"
    :type apply: str, optional
    :param apply_function: Apply function on data, defaults to None
    :type apply_function: function, optional
    :param apply_function_kwargs: Keyword arguments of fuction, defaults to {}
    :type apply_function_kwargs: dict, optional
    :return: Subjectwise data, after function apply
    :rtype: list[np.ndarray]
    """
    values = []
    for subject, value in self.data["subjects"].items():
        assert name in value, f"Data type {name} not found in dataset of subject {subject}."
        val = value[name]
        if apply_function and apply == "single":
            val = apply_function(val, **apply_function_kwargs)
        values.append(val)

    if apply_function and apply == "all":
        values = apply_function(values, **apply_function_kwargs)
    return values
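
For illustration, the two apply modes can be used like this (assuming a loaded dataset, here the bundled "hcp" example):

import numpy as np
from neurolib.utils.loadData import Dataset

ds = Dataset("hcp")

# apply="single": the function runs once per subject's data.
mean_bold = ds.getDataPerSubject("bold", apply_function=np.mean)

# apply="all": the function runs once on the list of all subjects' data.
group_cm = ds.getDataPerSubject("cm", apply="all", apply_function=np.mean,
                                apply_function_kwargs={"axis": 0})
print(len(mean_bold), group_cm.shape)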

loadDataset(datasetName, normalizeCmats='max', fcd=False, subcortical=False)

Load data into accessible class attributes.

Parameters:

    datasetName (str, required): Name of the dataset (must be in the datasets directory)
    normalizeCmats (str, default 'max'): Normalization method for Cmats
    fcd (bool, default False): Compute FCD matrices of BOLD data
    subcortical (bool, default False): Include subcortical areas from the atlas

Raises:

    NotImplementedError: If an unknown normalization method is used

Source code in neurolib/utils/loadData.py
def loadDataset(self, datasetName, normalizeCmats="max", fcd=False, subcortical=False):
    """Load data into accessible class attributes.

    :param datasetName: Name of the dataset (must be in `datasets` directory)
    :type datasetName: str
    :param normalizeCmats: Normalization method for Cmats, defaults to "max"
    :type normalizeCmats: str, optional
    :param fcd: Compute FCD matrices of BOLD data, defaults to False
    :type fcd: bool, optional
    :param subcortical: Include subcortical areas from the atlas, defaults to False
    :type subcortical: bool, optional
    :raises NotImplementedError: If an unknown normalization method is used
    """
    # the base directory of the dataset
    dsBaseDirectory = os.path.join(os.path.dirname(__file__), "..", "data", "datasets", datasetName)
    assert os.path.exists(dsBaseDirectory), f"Dataset {datasetName} not found in {dsBaseDirectory}."
    self.dsBaseDirectory = dsBaseDirectory
    self.data = dotdict({})

    # load all available subject data from disk to memory
    logging.info(f"Loading dataset {datasetName} from {self.dsBaseDirectory}.")
    self._loadSubjectFiles(self.dsBaseDirectory, subcortical=subcortical)
    assert len(self.data) > 0, "No data loaded."
    assert self.has_subjects

    self.Cmats = self._normalizeCmats(self.getDataPerSubject("cm"), method=normalizeCmats)
    self.Dmats = self.getDataPerSubject("len")

    # take the average of all
    self.Cmat = np.mean(self.Cmats, axis=0)

    self.Dmat = self.getDataPerSubject(
        "len",
        apply="all",
        apply_function=np.mean,
        apply_function_kwargs={"axis": 0},
    )
    self.BOLDs = self.getDataPerSubject("bold")
    self.FCs = self.getDataPerSubject("bold", apply_function=func.fc)

    if fcd:
        self.computeFCD()

    logging.info(f"Dataset {datasetName} loaded.")

loadMatrix(matFileName, key='', verbose=False)

Function to furiously load .mat files with scipy.io.loadmat. Info: More formats are supported but commented out in the code.

Parameters:

    matFileName (str, required): Filename of matrix to load
    key (str, default ''): .mat file key in which data is stored (example: "sc")
    verbose (bool, default False): Print information while loading

Returns:

    numpy.ndarray: Loaded matrix

Source code in neurolib/utils/loadData.py
def loadMatrix(self, matFileName, key="", verbose=False):
    """Function to furiously load .mat files with scipy.io.loadmat.
    Info: More formats are supported but commented out in the code.

    :param matFileName: Filename of matrix to load
    :type matFileName: str
    :param key: .mat file key in which data is stored (example: "sc")
    :type key: str
    :param verbose: Print information while loading, defaults to False
    :type verbose: bool, optional

    :return: Loaded matrix
    :rtype: numpy.ndarray
    """
    if verbose:
        print(f"Loading {matFileName}")
    matrix = scipy.io.loadmat(matFileName)
    if verbose:
        print("\tLoading using scipy.io.loadmat...")
        print(f"Keys: {list(matrix.keys())}")
    if key != "" and key in list(matrix.keys()):
        matrix = matrix[key]
        if verbose:
            print(f'\tLoaded key "{key}"')
    elif isinstance(matrix, dict):
        raise ValueError(f"Object is still a dict. Here are the keys: {matrix.keys()}")
    return matrix
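
An illustrative, self-contained call (the file name is made up for this sketch; the key "sc" follows the convention of the bundled DTI_CM*.mat files):

import numpy as np
import scipy.io
from neurolib.utils.loadData import Dataset

# Create a small .mat file so the call below actually runs.
scipy.io.savemat("example_cm.mat", {"sc": np.random.rand(4, 4)})

ds = Dataset()  # no dataset name: nothing is loaded, but the helper is usable
cm = ds.loadMatrix("example_cm.mat", key="sc", verbose=True)
print(cm.shape)  # (4, 4)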