Due to a data consent fiasco, I have now lost a year’s worth of work in a PhD program which is of limited length and with limited funding.

My supervisor’s solution is to attempt to get written consent from the participants now and to use only the subset of data for which I am able to get written consent. For reasons I needn’t expound here, this subset will be laughably small, and it will in no way be possible to write quality papers using it. Moreover, even if these papers get accepted, I would feel embarrassed to write them and would have no answer to questions regarding my data other than “I wasn’t told to get written consent, which is why I have so little data”, which I find reflects even worse on me than saying nothing does.

It would likely take 3-6 months and an extreme amount of personal stress and effort to collect new data, and I would then have used roughly 1.5 years of my PhD on data collection, with maybe 2.5 left over for actual work. How do I recover from losing roughly a year in my PhD due to bad data?

If I am publishing a conference paper and/or journal article about an ongoing, larger project, is it acceptable or even heard of to not only announce expected future work but also explicitly request collaboration/support regarding said future work? Since the quality of my work (and the work of many others) is directly influenced by the amount and quality of the data available to work with, I would like to polish the framework I use to collect data for a particular phenomenon of interest into one which is even more easily set up and executed than it is now. Of course, other people could then also use it to easily collect the same sort of data… and so it would be nice if other parties interested in the work did so and we ended up pooling our data together.

I haven’t seen this sort of implicit “I only have so much time to run experiments so it’d be totally awesome if you guys did it too” request, so I’m not sure if it’s just not done that commonly or if it’s not “a thing” at all.

Background:
I am a PhD student and am one step away from graduating. I have created a dataset in 3D microscopy vision, particularly 3D surface reconstruction from scanning electron microscope images, and the work has already been published in a highly respected journal. I would like to share the dataset with the research community to draw attention to it and possibly have the contribution(s) evaluated.

Question:
What are the typical channels that scholars use to share datasets?

I’ve found some source code that creates a visualization I like. I’ve (slightly) modified it and used exclusively my own data in it. Is it right to cite the GitHub repository I acquired the original code from? I don’t want to be improper or misleading at all, but then it’s not normal to cite matplotlib, say, which this doesn’t feel much different from (except it’s code from a single author), so I’m unsure of the correct procedure. I’m trying not to inflate the word count unnecessarily or give the false impression that the data is someone else’s (which I’m worried citing the code might do), but equally I don’t want to give off any false impressions of authorship on my part.

Edit: To clarify I’m not distrusting the code at all, I’m just using it to generate a figure in my paper.

There are a number of questions about accessing the data corresponding to publications. In recent years, many journals have improved their data availability policies, and this is less of an issue. On the other hand, I do not find much about how far it is considered acceptable to exploit these data.

Firstly, I believe most scientific communities accept that others’ data are replotted along with similar data acquired by the authors (showing repeatability) or along with similar data from yet other groups (meta-analysis type of figure). Still, I have the impression that in many communities people rather plot only their own data and comment that this is the same behavior as reported by so & so. So is there a policy? Should one ask for the authors’ permission? Should one specify exactly what data of theirs will be shown along with what other data?

Then, there’s the question of data re-analysis. For instance, say authors have published the time evolution of several variables of interest, and the Methods make clear that these were acquired simultaneously on the same sample. Is it OK, e.g., to publish separately from them a figure which plots one variable from their dataset against another? I guess the answer to this question can be yes or no, depending on the way this is done. I expect that in most cases, having this plot with only their data as my Fig 1, possibly overlaid with a model prediction, and being the main result of my paper, will be judged problematic. Is it? If yes, where is the red line between acceptable and unacceptable re-analysis? (E.g. I’ve already seen data re-plotted in log scale, but the figures with this data were in supplementary materials only.) And of course, the same questions as above apply.

After an extreme amount of effort, I am now in possession of a very large dataset of human experiments which is required for my research. However, after speaking with some of my colleagues about writing papers about it, I was told that I need to have had each participant sign a very specific consent form or else I can’t distribute the data… to anyone (yes, I know this is country-specific, but most countries require at least some legally valid form of consent).

I hadn’t thought about this because I was told by my supervisor to collect the data, so I did. Now they are agreeing with my colleagues in that I need to have had them sign a consent form… but if they told me to collect the data, then methinks they could have been prudent enough to say “oh, and before you go persuade 50+ random people to participate in an experiment, don’t forget to have them sign this form or else you can’t share the data with anyone at all”.

Are PhD supervisors responsible for telling their students about the legal requirements of their work? If not, is it expected that all “learning” done while a PhD student be from making horrible mistakes? Should I basically treat my supervisor not as a boss but as just some random person who might not actually know what I should do?

My project is based on data hosted by a different university than the one where I used to work. When I requested the data, I understood that the work (and cost) to extract it from the database increases with the number of items I wanted. I had no research funds at the time and therefore asked for a rather small sample, with an informal agreement with the director of the institute that the person extracting the data would be my collaborator and coauthor.

Several months passed before I received the data, and then the sample turned out to be too small to be really useful. I did not make another request, as the data extraction process appeared so time-consuming. Instead, I moved on to work on other projects for a couple of years.

The data became relevant again when I started as a visiting researcher at this institute. Although I wasn’t employed by the institute, I had direct access to the database and could download all of it in a matter of minutes. I expected that the person who extracted the first sample would be willing to collaborate on data classification and other things where he was an expert. Instead, he seemed hostile and would answer only exactly what I asked, rather than make an attempt to be helpful.

The most recent development is that the institute made most of this database and classification open for anyone to download.

What should I do to get out of the original data use agreement? I feel trapped in a forced collaboration with a person with whom I’m not on speaking terms. I’m not in a position to renegotiate, nor am I entitled to any help from the institute, as it is not my employer.

I took this situation more seriously than anyone should, as my depression returned and I went to psychotherapy for nine months. That says something about the amount of time I have wasted on this dilemma.

My topic will discuss which leadership style, transactional or transformational, has the best impact on service quality. The SERVQUAL model will be used to measure service quality.

However, I’m not really sure how to present the data. For example, a questionnaire from one employee will show tendencies of transactional leadership having a positive impact on service quality, and another will show transformational leadership having a negative impact. So I basically have two independent variables (transactional and transformational leadership) and one dependent variable (service quality).

How do I present and analyze which independent variable affects service quality most, since different employees will mention different leadership styles and hold different views of service quality?
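One possible way to frame the comparison, sketched here under assumptions, is a multiple regression with standardized coefficients: if each questionnaire yields a numeric score for transactional leadership, one for transformational leadership, and a SERVQUAL-based service quality score, the two standardized betas can be compared directly. The function name `standardized_betas`, the variable names, and all the numbers below are hypothetical; the data are synthetic and made up purely for illustration, not results of any real survey.

```python
import numpy as np

def standardized_betas(X, y):
    """Standardized OLS coefficients for the columns of X predicting y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # z-score every variable so the coefficients share a common scale
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # no intercept column is needed: all z-scored variables have mean zero
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas

# Synthetic questionnaire scores (assumption: one row per employee)
rng = np.random.default_rng(0)
n = 200
transactional = rng.normal(3.0, 0.8, n)
transformational = rng.normal(3.5, 0.7, n)
# Made-up relationship used only to generate example data
service_quality = (0.2 * transactional
                   + 0.6 * transformational
                   + rng.normal(0.0, 0.5, n))

betas = standardized_betas(
    np.column_stack([transactional, transformational]),
    service_quality,
)
names = ["transactional", "transformational"]
print({name: round(float(b), 2) for name, b in zip(names, betas)})
```

The predictor with the larger absolute standardized beta has the stronger association with service quality in the sample; a full analysis would also report confidence intervals and check the usual regression assumptions.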

I’m coding some transcripts at the moment and the data I am working with is not anonymized yet. I’m a research assistant, I did not collect the data and I am obviously not the PI. What strikes me is that I know some of the people interviewed so it’s a bit sensitive for me to be coding that non-anonymized data.

Which has me wondering, especially in cases where the data is analyzed by a big team, when does the data typically get anonymized and who does it? Would it be common to ask the transcriber to do it?

Co-author C, the third co-author of a study from 2016 that is available online (under subscription) with four other co-authors (A, B, D, E), wants to re-use the data of this paper (included in its figures or tables). Co-authors C and A are in conflict. These data will be presented and analyzed in a different manner to support C’s own new data set in a paper that co-author C wants to submit as sole author. Co-author C will only use the published data (and not the raw data) of the study of 2010. Some raw data are available online (under subscription). However, co-author C collected, but did not process, only an insignificant part of these data. Given the permission that co-author C has obtained from the editor (he is limited to re-using only 3 tables or figures), does co-author C have the right to publish part of these data in a new paper as sole author? These data are considered to represent 30% of the data set for the new paper. Does co-author C have the right to re-use the data without the permission of the other co-authors (A, B, D, E)? Does the same apply to the raw data already published as supporting data? Is co-author C allowed to include in the “Materials and Methods” or “Results” section of his new sole-author paper some brief sentences describing how he obtained some of these data and these previous results?
The PI requests that all the co-authors (A, B, D, E) be included in the new publication of co-author C. Co-author C does not agree, as most of the co-authors (A, B, D, E) did not participate in the elaboration, analysis, and writing of the new paper.