ETHICS AT MYRA

Assisting Researchers, Not Replacing Them

Why MyRA is a trusted solution for researchers

Reduced Risk of AI Hallucinations

Through our backend pre- and post-processing, we leverage GenAI in a controlled manner that minimizes the risk of 'hallucination' or incorrect information. Our platform focuses on extracting proposed themes directly from the text uploaded by users. Each identified theme is linked to supporting quotes and sentences from the source material, allowing for straightforward verification. This ensures that the findings are reliable and can be easily cross-checked against the original data, maintaining the integrity and accuracy of your research.

Ethical and Secure by Design

We prioritize the ethical use of AI in research by maintaining strict data security measures and ensuring that researchers remain at the heart of the analysis process. MyRA acts as an assistant, providing a head start on data analysis while preserving the critical role of human judgment and interpretation. We continuously review the latest advances in AI as well as findings from research on AI and qualitative research to improve our product.

Overcoming black-boxed performance through transparency

LLM predictions lack explainability: it is difficult to see how a model arrived at a given output. To compensate, we present the findings in a way that allows researchers to easily check the AI's work. By providing themes along with the associated quotes for each uploaded interview, we give you the opportunity to verify that each theme makes sense for the given quote. This process supports research integrity by ensuring that every insight is directly traceable to the original source material.

WHY YOU NEED MYRA

Justifying the use of MyRA in research

MyRA Justification Table

  • Check for missed themes: MyRA helps ensure comprehensive analysis by identifying themes that may have been overlooked by humans (Jalali et al., 2024; Wachinger et al., 2024).

  • Reduce certain biases: MyRA can reduce certain human biases in analysis, while also acknowledging and addressing the introduction of new AI-related biases (Jalali et al., 2024).

  • Rapid processing: MyRA can process large volumes of information quickly, saving valuable time.

  • Consistent analysis: MyRA provides more consistent and tailored expert analysis, compared to other AI tools like ChatGPT.

WHY YOU CAN USE MYRA

Ensuring safe and ethical use of GenAI in research

MyRA Justification Table

  • Secure: MyRA employs robust data security measures to keep user data safe.

  • Ethical: MyRA ensures users are informed about potential biases and limitations of AI and must acknowledge them before using the platform.

  • Reliable: MyRA anchors themes in direct quotes from the data, reducing the risk of AI hallucinations to improve reliability.

  • Transparent: MyRA presents information transparently, allowing researchers to easily verify and validate the findings.

Read our guidance for ethics boards, journals, publishers and reviewers.

Principles of use of MyRA in academic research

In June 2024, we conducted our latest review of the literature on the ethics and integrity of using GenAI and LLMs in qualitative research. This review explores the implications of AI integration in scholarly and other research work, examining the potential benefits, limitations and ethical considerations. We have used these findings to guide our principles of use and adapt our product. We hope our findings provide valuable insights into the reliability of these technologies, guiding researchers and users of our platform in their responsible and effective application.

1. Transparency and AI acknowledgment

We value transparency and endorse the recommendations of Resnik and Hosseini (2023) for the use of AI in research. The use of MyRA in the qualitative analysis must be clearly acknowledged in all relevant outputs. State the date of use and who performed the upload, and distinguish which codes and themes were generated by MyRA versus those generated by people (Lopez-Fierro & Nguyen, 2024).
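One way to keep the information needed for such an acknowledgement is to record it at the time of analysis. The following is only a minimal sketch in Python: the `AnalysisRecord` class, its field names, and the example researcher and themes are all hypothetical illustrations and are not part of MyRA or of any cited guideline.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AnalysisRecord:
    """Hypothetical record of one AI-assisted analysis (illustration only)."""
    tool: str
    date_of_use: date
    uploaded_by: str
    # Maps each code/theme label to its origin: the tool name or "human".
    codes: dict = field(default_factory=dict)

    def add_code(self, label: str, source: str) -> None:
        """Register a code or theme together with who generated it."""
        self.codes[label] = source

    def acknowledgement(self) -> str:
        """Build a plain-text statement separating tool and human themes."""
        by_tool = sorted(c for c, s in self.codes.items() if s == self.tool)
        by_human = sorted(c for c, s in self.codes.items() if s != self.tool)
        return (
            f"{self.tool} was used on {self.date_of_use.isoformat()} "
            f"(upload performed by {self.uploaded_by}). "
            f"Themes generated by {self.tool}: {', '.join(by_tool) or 'none'}. "
            f"Themes generated by the researchers: {', '.join(by_human) or 'none'}."
        )


# Example values are invented for illustration.
record = AnalysisRecord(tool="MyRA", date_of_use=date(2024, 6, 3), uploaded_by="A. Researcher")
record.add_code("Access barriers", "MyRA")
record.add_code("Feeling unheard", "human")
statement = record.acknowledgement()
print(statement)
```

The generated statement is a starting point only; the wording required by a journal or ethics board takes precedence.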

2. COREQ and SRQR compliance

Follow the COREQ and SRQR guidelines on reporting qualitative research for all relevant outputs. Focus on COREQ’s domain 3, “analysis and findings”, i.e.:

  • “How many data coders coded the data? Did authors provide a description of the coding tree? Were themes identified in advance or derived from the data? What software, if applicable, was used to manage the data? Did participants provide feedback on the findings?”

and on SRQR’s items on data processing, analysis, and trustworthiness, i.e.:

  • “Methods for processing data prior to and during analysis, including transcription, data entry, data management and security, verification of data integrity, data coding, and anonymization/de-identification of excerpts”

  • “Process by which inferences, themes, etc., were identified and developed, including the researchers involved in data analysis; usually references a specific paradigm or approach; rationale”

  • “Techniques to enhance trustworthiness and credibility of data analysis (e.g., member checking, audit trail, triangulation); rationale”

3. Researcher responsibility

MyRA is intended as a support tool, and the user/researcher takes full responsibility for the final analysis. LLMs lack explicit knowledge, reflexivity, and understanding of meaning (including recognising irony and metaphor). Because they are stochastic (containing built-in randomness), the same input will not always produce the same output (Lindebaum and Fleming, 2024). Users should assess the input material themselves. Best practice is to triangulate the themes extracted by MyRA against those extracted by a human analyst. MyRA may be particularly useful for triangulating expert human analysis, because its independent analysis complements the analyst’s specific perspective. Although MyRA’s processes check the veracity of the outputs, the user should also cross-check the extracted quotes against the uploaded material. The user must develop their own theoretical analysis of the extracted themes and codes.
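The quote cross-check can be partly automated on the researcher's side. The sketch below is a minimal illustration in Python, not a description of MyRA's own backend: the function names, the normalisation rules (case, whitespace and curly quotes) and the sample transcript and report are all assumptions made for the example.

```python
import re


def normalize(text: str) -> str:
    """Unify curly quotes, collapse whitespace and lower-case for comparison."""
    text = text.translate(str.maketrans({"‘": "'", "’": "'", "“": '"', "”": '"'}))
    return re.sub(r"\s+", " ", text).strip().lower()


def check_quotes(transcript: str, quotes_by_theme: dict) -> dict:
    """Return, per theme, the quotes that are NOT found in the transcript."""
    haystack = normalize(transcript)
    missing = {}
    for theme, quotes in quotes_by_theme.items():
        not_found = [q for q in quotes if normalize(q) not in haystack]
        if not_found:
            missing[theme] = not_found
    return missing


# Invented sample data for illustration.
transcript = """Interviewer: How did you find the service?
Participant: Honestly, the waiting times were   the hardest part.
I felt nobody was listening to me."""

report = {
    "Access barriers": ["the waiting times were the hardest part"],
    "Feeling unheard": ["I felt nobody was listening to me.", "Staff ignored my calls."],
}

unverified = check_quotes(transcript, report)
print(unverified)  # {'Feeling unheard': ['Staff ignored my calls.']}
```

A check like this only confirms that a quote exists in the source. Whether the quote actually supports the proposed theme remains a judgement for the researcher.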

4. Anonymity, confidentiality and consent

No potentially identifying personal information should be uploaded to MyRA. For analysing text related to medical or other potentially sensitive research it is best practice to seek prior signed informed consent to the analysis, either specifically by MyRA or generally by an AI/LLM. MyRA has prepared a draft template that may be adapted. It is the responsibility of the user to ensure that any necessary ethical approval has been granted before any information is uploaded.

5. Acknowledgement of biases

Users should note the potential biases inherent to all large language models. They were trained on the open internet, where the training material may lack diversity. The output may reflect cognitive or language biases, including stereotypes, among those who wrote the training material or who refined the model during Reinforcement Learning from Human Feedback. Three issues are particularly relevant to the analysis of qualitative data: the risk of anchoring bias, i.e., putting emphasis on certain information; limited model explainability, i.e., the difficulty in explaining precisely how complex outputs are arrived at (Ray, 2023); and the tendency to produce high-probability words, which pushes outputs towards being descriptive rather than interpretative (Wachinger et al., 2024). MyRA’s processes aim to ensure a focus on the inputted data and to directly use and show the quotations that support the proposed themes.

Using MyRA vs other AI tools

Tailored for analysts and researchers


Security

  • MyRA: Does not train on your data

  • ChatGPT: May use your data for model training. Chat logs may be retained indefinitely if de-identified, and for up to 30 days when deleted

Length of output

  • MyRA: Consistently produces reports up to 15,000 words

  • ChatGPT: Limited in output length, often requires multiple prompts for long content

Quality of output

  • MyRA: Refined backend processes ensure high-quality, targeted responses

  • ChatGPT: Variable quality, sometimes inconsistent or off-topic

Usability

  • MyRA: Streamlined, quick and user friendly for research needs

  • ChatGPT: More complex and less intuitive for complex tasks

Reliability

  • MyRA: Uses pre-processing of inputs and post-processing of outputs to ensure reliability and accuracy

  • ChatGPT: Can fabricate or unfaithfully report quotations from source material

Consistency

  • MyRA: Consistent approach, no need to refine questions, user-friendly without requiring programming skills

  • ChatGPT: Free text prompts can lead to inconsistent results, requiring users to refine prompts for useful responses

Session Independence

  • MyRA: Each upload is independent, ensuring no bias from previous content

  • ChatGPT: Uses information within sessions to refine responses, risking bias in later outputs

Authenticity and transparency of analysis

  • MyRA: MyRA is transparent about its limitations without misleading users

  • ChatGPT: May give appearance of complex analysis but lacks true capability without careful guidance and detailed prompting

MyRA uses the OpenAI API (application programming interface) to directly access their large language models (LLMs). Other publicly-available LLMs currently lack the power of OpenAI’s, though this may change. You need to consider the ability, reliability, trustworthiness, privacy, and security of any other provider, such as who has access to the data (including third-party providers), retention policies, whether your inputs are used to train the model, and certification. We will do a thorough assessment before using other AI models in our pipeline and will always be transparent about which models are being used.

Using OpenAI’s ChatGPT will not give the same results as using MyRA (see table above).

  • MyRA uses pre-processing of inputs into the API and post-processing of outputs to increase the length and reliability of our results. For example, ChatGPT can fabricate or unfaithfully report quotations from source material, whereas MyRA’s backend processes ensure the use of original, untruncated quotations.

  • MyRA’s configurations ensure that no content is used for training and that no content will be retained by OpenAI beyond a few hours, whereas ChatGPT’s chat logs are kept in your account and may be retained indefinitely if they have been de-identified and unlinked from your account, and for up to 30 days when deleted.

  • MyRA uses a consistent approach, avoiding the need to refine the questions you ask of the model to get a useful response. Free text prompts add inconsistency into the process and the results, and we know that most users and researchers do not want to become prompt engineers. We assume no familiarity with programming or conducting computer-assisted research analysis with other tools.

  • ChatGPT uses information within sessions to refine its responses, raising the risk that previous content will bias later outputs within a single session. Each upload for MyRA is independent.

  • ChatGPT can fool you into thinking it is more capable than it really is. It may appear to comply with complex requests, for example to use a particular approach to qualitative research such as Grounded Theory. However, ChatGPT does not have the knowledge needed to conduct that analysis; in reality, it produces, word by word, text that looks like that analysis according to its language model. Without careful guidance such as chain-of-thought prompting, the output will not be the result of a real research process, even if it looks plausible.

FAQs

  • MyRA generates detailed, transparent reports that allow researchers to easily verify and validate the AI's findings, ensuring the integrity of your research.

  • All data uploaded to MyRA is encrypted, and we delete all data daily to ensure it is not stored beyond its purpose. You retain full ownership of your data. Check our Data Security section for more details.

  • No, MyRA is designed to assist researchers by streamlining the analysis process. It provides a head start on data analysis but does not replace the critical thinking and expertise that human researchers bring to their work. The themes and supporting quotes generated by MyRA are intended to provide insights to the user, but users are solely responsible for their own final analysis and conclusions. As with automated transcription, the aim is to produce a ‘first draft’ that will be checked and refined by the user.

  • Yes, MyRA's insights are based on rigorous analysis and transparency, allowing you to verify and validate the findings. We adhere to ethical guidelines and best practices to ensure accuracy and reliability. However, just as a principal investigator would, it is your responsibility as a researcher to check what your research assistant produces and to make sure you have a contextualised understanding of your topic. We’ve presented the data in a way that can help you do that.

  • MyRA is committed to promoting the responsible use of AI in research. We prioritize data security, transparency, and accountability, adhering to ethical guidelines and best practices to ensure our platform supports researchers ethically and effectively. Our ethics advisor, Matt, brings two decades of research ethics experience to ensure MyRA upholds the highest ethical standards.

  • MyRA recommends not uploading non-anonymised data to avoid any potential breach of data protection. The decision whether to upload potentially identifying material is that of the user alone; MyRA employees and contractors do not have access to the uploaded data and MyRA does not itself de-identify uploaded data.

  • Obtaining appropriate consent is the responsibility of the user and not MyRA. MyRA will not ask for signed consent forms, though MyRA may ask for a blank copy if necessary. Users may be able to use automated analysis without further consent from participants, though they may wish to consult their IRB/REC on whether re-consenting is necessary. If any potentially identifying information and/or sensitive information will be uploaded, the user should obtain explicit consent for this from the participants. A template/example consent form is available here; the wording of the form used to obtain consent from participants is the responsibility of the user/researcher.

  • MyRA uses the OpenAI API. This gives access to a large language model based on the Transformer technology. The model was trained on a text corpus and then refined through Reinforcement Learning from Human Feedback (RLHF); the language of the corpus and of the human trainers will affect the specific phrasing of the outputs. The model is stochastic, meaning that the way it responds to prompts is not deterministic. Because of this, repeatedly running the same input will not necessarily give exactly the same output.

  • MyRA, meaning either the company or the tool, may not be listed as an author on any outputs. MyRA employees and contractors may not be listed as an author on outputs using material from MyRA unless they made a direct and significant intellectual contribution to that project.

  • We have listed a few guidelines below on which we have based our assessment:

    Living guidelines on the responsible use of generative AI in research, European Commission, March 2024

  • Yes, you can. If you already have themes that you have drawn from your data and want to cross-check them against what MyRA finds in your data, you can do so. This is similar to in-context learning (Xie and Min, 2022).