Privacy Statement

Best intent

SADiLaR aims to serve the research community by providing a platform for the responsible sharing of research data. SADiLaR expects all individuals submitting data to our repository to act not just in good faith, but in the best faith. This means that we expect individuals to adhere to the highest ethical standards, including an ethic of care. Additionally, we pledge to act in best faith ourselves.

We also align our data practices with existing legislation, including the Protection of Personal Information Act (POPIA) of 2013.

Given that errors in the application of such safeguards are possible, we welcome constructive criticism and have an established recourse in the event of errors.

Recourse

Users of the services or resources offered by SADiLaR are encouraged to contact SADiLaR with any questions, suggestions, or concerns regarding the data we host. We will promptly address any issues regarding privacy violations or any suggestions for improvements.
Please send an email to info@sadilar.org to report any issues, provide feedback, or describe any possible privacy issues. We value your feedback, and by working together, we can strive toward trusted and reliable services and resources.

How data is prepared

Privacy concerns do not necessarily require specific preparations for all resources or datasets. For some resources and datasets we take specific care, which we summarise below.

South African Multilingual Learner Corpus of Academic Texts (SAMuLCAT)

Students at collaborating institutions can indicate their consent for their data to be used in this project through an online form that also collects the demographic information required for the research. Without a student’s consent, the associated data is excluded from the SAMuLCAT corpus.

A number of precautions are taken to anonymise included texts:

  • Automated filtering removes any paragraph deemed likely to contain personal information. The cases detected include paragraphs that contain:
    • an e-mail address
    • a sequence of digits that looks like a phone number or student number
    • a sentence made up only of words that are title-case, start with a number or symbol, or are tagged as nouns
  • Manual spot checks review the data for any obvious problems.
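
The automated filtering step above can be illustrated with a minimal sketch. This is not SADiLaR's actual implementation: the regular expressions, the seven-digit threshold, the function names, and the `is_noun` callback (a stand-in for a part-of-speech tagger) are all assumptions made for illustration, and the third rule is read here as "every word in the sentence matches one of the listed conditions".

```python
import re
from typing import Callable, Iterable, List

# Hypothetical patterns; the real SAMuLCAT rules are not specified in this statement.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Assumed threshold: seven or more digits, optionally separated by spaces or hyphens.
NUMBER_RE = re.compile(r"\d(?:[ -]?\d){6,}")
# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")

NounCheck = Callable[[str], bool]


def _no_tagger(word: str) -> bool:
    """Placeholder for a part-of-speech tagger; never reports a noun."""
    return False


def word_is_suspicious(word: str, is_noun: NounCheck) -> bool:
    """True if the word is title-case, starts with a digit or symbol, or is a noun."""
    return word.istitle() or not word[0].isalpha() or is_noun(word)


def sentence_is_suspicious(sentence: str, is_noun: NounCheck = _no_tagger) -> bool:
    """True if the sentence is non-empty and every word in it is suspicious."""
    words = sentence.split()
    return bool(words) and all(word_is_suspicious(w, is_noun) for w in words)


def paragraph_is_suspicious(paragraph: str, is_noun: NounCheck = _no_tagger) -> bool:
    """Apply the three checks: e-mail address, digit sequence, suspicious sentence."""
    if EMAIL_RE.search(paragraph) or NUMBER_RE.search(paragraph):
        return True
    sentences = SENTENCE_SPLIT_RE.split(paragraph.strip())
    return any(sentence_is_suspicious(s, is_noun) for s in sentences)


def filter_paragraphs(
    paragraphs: Iterable[str], is_noun: NounCheck = _no_tagger
) -> List[str]:
    """Keep only the paragraphs that none of the checks flag."""
    return [p for p in paragraphs if not paragraph_is_suspicious(p, is_noun)]
```

A filter of this kind deliberately errs toward removing too much: a short heading or a one-word sentence would also be dropped, which is consistent with removing any paragraph "deemed likely" to contain personal information and with following up through manual spot checks.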