I am writing my thesis (computer science) and wonder how to refer to an algorithm that I reimplement as part of the thesis.

There is a paper describing the original algorithm and I am only implementing this algorithm in my thesis.

I considered referring to it as ‘baseline’ or ‘reference algorithm’, but ‘baseline’ has a different meaning in data mining, and in my opinion (as a non-native English speaker) ‘reference algorithm’ sounds as if the algorithm differs completely from mine. The algorithm itself is not named, and the paper’s title is too long to use (e.g. in the TOC).

I’ve recently begun working on machine learning; before that I was a mathematician. Despite that background, I’m confident in my knowledge, because ML is a fast-moving area and I have studied its recent progress and numerous papers in depth. However, I’m neither good nor fast at turning my ideas into working code. I have no problem writing pseudocode, adding details and designing experiments, but it takes me considerably longer to implement my neural networks in TensorFlow or PyTorch so that they run reasonably fast. I’m not used to debugging such code either, since it’s much harder than the simple, fast algorithms I had to learn before. As a result, it takes me an incredibly long time to conduct an experiment by myself.

An interesting paper was published 3 days ago, and the author neither understands its full implications nor has access to enough computational resources. I can attempt a very interesting experiment, and if it succeeds (which is likely), it would be a big win for both the author and me. However, because I am so slow, I probably have no chance, since others will likely publish before I do. I need someone with machine learning knowledge who is not too bad at coding.

I have three choices: grad students at my university, the aforementioned author, and well-known experts in the field. I’ve never met the people in the last two groups, but both are familiar with the topic. The author doesn’t have good coding skills either: he is only a student and failed to code one of the things I’m trying to do.

I’m not sure whether I can trust either of them. If I propose my ideas and a coauthored paper, my lack of coding experience may put them off, and there’s no guarantee that they won’t simply claim to have had the idea already, or ignore my email, and then publish a paper based on my idea before I do, even if they don’t yet know the paper (in the experts’ case) or my idea. Grad students could probably help me, but I’m not sure whether they would be interested in or familiar with the topic.

I believe this situation can be generalized. In such cases, what is the best choice? Any advice? By the way, as far as I know, it’s uncommon to upload to arXiv a non-theoretical paper that only proposes experiments and a generalization of an existing algorithm. Even if I did, it would not earn me much credit; if it did, proposal papers would be abundant.

I’ve found some source code that creates a visualization I like. I’ve (slightly) modified it and used exclusively my own data with it. Is it right to cite the GitHub repository I got the original code from? I don’t want to be improper or misleading at all, but it’s not normal to cite matplotlib, say, and this doesn’t feel much different (except that it’s code from a single author), so I’m unsure of the correct procedure. I’m trying not to inflate the word count unnecessarily or give the false impression that the data is someone else’s (which I worry citing the code might do), but equally I don’t want to give a false impression of authorship on my part.

Edit: To clarify, I don’t distrust the code at all; I’m just using it to generate a figure in my paper.

A bit of background: I work in an R&D department of quite a big name in the IT industry, and we are trying out, comparing and combining different ideas and cutting-edge inventions to tackle our engineering problem.

Now the problem itself: There is an excellent paper from CVPR 2017 that I want to try for my project. However, the authors published neither their code nor their trained FCNN (fully convolutional neural network). The project I am working on is still in its infancy, meaning that we need to build a proof-of-concept product as soon as possible. At this stage, training a deep CNN is a very time-consuming and expensive process, so ideally I would like to ask the authors for their trained FCNN model. Of course, if the concept works, we will re-train and fine-tune the whole thing for our specific task, so the authors’ CNN will not end up in the final product.

Question: How should I properly ask the authors for it? And is it appropriate at all to ask researchers for their results for use in a project?

I have several code snippets shared as gists on GitHub. How can I make them citable with a DOI?

I know whole repositories on GitHub can be made citable using Zenodo or Figshare (guide). But as my gists are mainly R functions, I don’t want to build entire R packages just to make them citable.

What are the other options available?

I wrote some software as part of an internship. It worked well for the task and we published a paper on it. However, because the internship was at a company, not my graduate institution, the code is technically property of the company and they do not want to release it. I’ve received several requests for the code, and I know that releasing the software will increase my paper’s citations and impact. Is it okay to re-implement the code I wrote and release it as open source, or could that violate some laws or policies?

For the record, when I say re-implement, I mean start from scratch, not reuse most of the same code and change a few things.

I want to create a dataset (offline, only for my research purposes) based on scientific publications (research papers) that are connected in some way to software source code (for example, via a Git repository). The research domains of the publications are irrelevant, as long as the papers discuss the related source code (for example, by mentioning functions or variables).
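To make the "connected to source code" criterion concrete, here is a minimal sketch of how I might check whether a paper's plain text links to a code repository. The function name `find_repo_links` and the list of hosting sites are my own assumptions for illustration, not part of any existing tool, and a real pipeline would first need to extract the text from the PDF.

```python
import re

# Hypothetical list of code-hosting sites to look for; extend as needed.
REPO_HOSTS = ("github.com", "gitlab.com", "bitbucket.org")

# Matches URLs of the form https://<host>/<owner>/<repository>.
_REPO_URL = re.compile(
    r"https?://(?:www\.)?(?:"
    + "|".join(re.escape(host) for host in REPO_HOSTS)
    + r")/[\w.-]+/[\w.-]+",
    re.IGNORECASE,
)


def find_repo_links(text):
    """Return the unique repository URLs found in text, in order of appearance."""
    links = []
    for match in _REPO_URL.finditer(text):
        url = match.group(0).rstrip(".")  # drop sentence-ending periods
        if url.endswith(".git"):          # treat clone URLs like web URLs
            url = url[:-4]
        if url not in links:
            links.append(url)
    return links
```

A paper whose text yields a non-empty list would be a candidate for the dataset; whether it also mentions specific functions or variables would need a separate check.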

In order to create the dataset, I need journals/websites where papers and their related code are listed. I already found the Journal of Open Research Software, which fits my requirements, but I am pretty sure there are more resources like this out there.

How can I find publication venues in which papers regularly include links to related source code?