The New York Times and a group of writers sue OpenAI and Microsoft for copyright infringement, consisting in the reproduction of articles (or of their books) to train their artificial intelligence and in the inclusion of those works in its output

– I –

On 27 December the NYT reported that it had filed suit over the plundering of its articles and materials to train ChatGPT and other AI systems, and over their use in the output generated from users’ prompts.

It also links to the complaint, filed with the Southern District of New York on 27 December 2023, Case 1:23-cv-11195.

Of interest here is the description of how generative AI and its training work, together with the history of OpenAI, which, contrary to its beginnings (open only strategically, one is tempted to say), is no longer open: §§ 55 ff.; see § 75 ff.

The infringing conduct (illustrated with many real examples, in the form of screenshots, often in colour, of the tests run by the plaintiff; indeed, another article reports that Exhibit J contains 100 examples, and the same site, in a further article, links directly to that Exhibit J) is as follows:

– Unauthorized Reproduction of Times Works During GPT Model Training, § 83 ff.

– Embodiment of Unauthorized Reproductions and Derivatives of Times Works in GPT Models, § 98 ff.

– Unauthorized Public Display of Times Works in GPT Product Outputs, § 102 ff.

– Unauthorized Retrieval and Dissemination of Current News, § 108 ff.

See now on YouTube the interesting line-by-line analysis of the complaint by Giovanni Ziccardi.

– II –

News then arrives of a similar lawsuit (here, however, as a class action) brought by US writers. See the complaint filed on 19 December 2023 in the Southern District of New York by Alter, Bird, Branch and others against more or less the same defendants. The training datasets are drawn from Common Crawl, WebText, Books1 and Books2, Wikipedia etc., § 72 (as OpenAI itself states).

The alleged manner of the infringement:

<<90. Defendants used works authored and owned by Plaintiffs in the training of their GPT models, and in doing so reproduced these works and commercially exploited them without a license.
91. While OpenAI and Microsoft have kept the contents of their training data secret, it is likely that, in training their GPT models, they reproduced all or nearly all commercially successful nonfiction books. As OpenAI investor Andreesen Horowitz has admitted, “large language models,” like Defendants’ GPT models, “are trained on something approaching the entire corpus of the written word,” a corpus that would of course include Plaintiffs’ works.
92. The size of the Books2 database—the “internet based books corpora” that
Defendants used to train GPT-3, GPT-3.5, and possibly GPT-4 as well—has led commentators to believe that Books2 is comprised of books scraped from entire pirated online libraries such as LibGen, ZLibrary, or Bibliotik. Shawn Presser, an independent software developer, created an open-source set of training data called Books3, which was intended to give developers, in his words, “OpenAI-grade training data.” The Books3 dataset, similar in size to Books2, was built
from a corpus of pirated copies of books available on the site Bibliotik. Works authored and owned by Plaintiffs Alter, Bird, Branch, Cohen, Linden, Okrent, Sancton, Sides, Schiff, Shapiro, Tolentino, and Winchester are available on Books3, an indication that these works were also likely included in the similarly sized Books2>>.

We shall see the outcome (and, hopefully, the defendants’ answer before long).

– III –

“Chat GPT Is Eating the World” has published a useful list of the lawsuits pending in the US that assert copyright against use in AI (there are 15, almost all class actions).

There you will also find the case file of the above-cited New York Times v. Microsoft-OpenAI (see DOCKET, direct link here; and here, in the various Exhibits, the list of the enormous number of articles copied).

– IV –

It remains to be seen, however, whether training large language models on protected material really entails a “reproduction” of it from a technical/computing standpoint; or rather, whether what happens technically can legally be classified as “reproduction”. Kevin Bryan on X says it does not; Lemley and Casey likewise argue that it is lawful, on policy grounds. But given the law in force, it must be ascertained whether or not there is reproduction: if there is, any creative adaptation (whether it exists, and how its creativity should be assessed, remains entirely to be seen) cannot do without the consent of the owners of the reproduced works.

That these AI systems need access to mostly protected material is understandable: OpenAI says so itself (see Dan Milmo in the Guardian, 8 January 2024). But that does not help resolve the technical-legal question above.

More on AI, data scraping and copyright infringement (this time mostly denied)

The District Court for the Northern District of California, 30 October 2023, Case 3:23-cv-00201-WHO, Andersen v. Stability AI, DeviantArt, Midjourney, examines this issue (reported and linked by Jess Miers on X).

All the claims are dismissed except the one against Stability, for which leave to amend is granted:

<<3. Direct Infringement Allegations Against Stability. Plaintiffs’ primary theory of direct copyright infringement is based on Stability’s creation and use of “Training Images” scraped from the internet into the LAION datasets and then used to train Stable Diffusion. Plaintiffs have adequately alleged direct infringement based on the allegations that Stability “downloaded or otherwise acquired copies of billions of copyrighted images without permission to create Stable Diffusion,” and used those images (called “Training Images”) to train Stable Diffusion and caused those “images to be stored at and incorporated into Stable Diffusion as compressed copies.” Compl. ¶¶ 3-4, 25-26, 57. In its “Preliminary Statement” in support of its motion to dismiss, Stability opposes the truth of plaintiffs’ assertions. See Stability Motion to Dismiss (Dkt. No. 58) at 1. However, even Stability recognizes that determination of the truth of these allegations – whether copying in violation of the Copyright Act occurred in the context of training Stable Diffusion or occurs when Stable Diffusion is run – cannot be resolved at this juncture. Id. Stability does not otherwise oppose the sufficiency of the allegations supporting Anderson’s direct copyright infringement claims with respect to the Training Images>>.

An interesting decision for anyone working on the topic, since none has yet been seen in our country.

Another copyright-based action against an AI company: Concord Music, Universal Music and others v. Anthropic PBC

Through its AI model called Claude 2, Anthropic allegedly infringes the copyright in many songs (specifically in their lyrics). So claims the complaint filed by many publishers (among the largest in the world, it seems).

The Verge reports it today, 19 October (article by Emilia David), where you will also find the link to the complaint.

I reproduce only the passages on how Claude 2’s training and output work, and then on where the infringement lies.

<<6 . Anthropic is in the business of developing, operating, selling, and licensing AI technologies. Its primary product is a series of AI models referred to as “Claude.” Anthropic builds its AI models by scraping and ingesting massive amounts of text from the internet and potentially other sources, and then using that vast corpus to train its AI models and generate output based on this copied text. Included in the text that Anthropic copies to fuel its AI models are the lyrics to innumerable musical compositions for which Publishers own or control the copyrights, among countless other copyrighted works harvested from the internet. This copyrighted material is not free for the taking simply because it
can be found on the internet. Anthropic has neither sought nor secured Publishers’ permission to use their valuable copyrighted works in this way. Just as Anthropic does not want its code taken without its authorization, neither do music publishers or any other copyright owners want their works to be exploited without permission.
7.
Anthropic claims to be different from other AI businesses. It calls itself an AI “safety and research” company, and it claims that, by training its AI models using a so-called “constitution,” it ensures that those programs are more “helpful, honest, and harmless.” Yet, despite its purportedly principled approach, Anthropic infringes on copyrights without regard for the law or respect for the creative community whose contributions are the backbone of Anthropic’s infringing service.
8.
As a result of Anthropic’s mass copying and ingestion of Publishers’ song lyrics, Anthropic’s AI models generate identical or nearly identical copies of those lyrics, in clear violation of Publishers’ copyrights. When a user prompts Anthropic’s Claude AI chatbot to provide the lyrics to songs such as “A Change Is Gonna Come,” “God Only Knows,” “What a Wonderful World,” “Gimme Shelter,” “American Pie,” “Sweet Home Alabama,” “Every Breath You Take,” “Life Is a Highway,” “Somewhere Only We Know,” “Halo,” “Moves Like Jagger,” “Uptown Funk,” or any other number of Publishers’ musical compositions, the chatbot will provide responses that contain all or significant portions of those lyrics>>.

<<11. By copying and exploiting Publishers’ lyrics in this manner—both as the input it uses to train its AI models and as the output those AI models generate—Anthropic directly infringes Publishers’ exclusive rights as copyright holders, including the rights of reproduction, preparation of derivative works, distribution, and public display. In addition, because Anthropic unlawfully enables, encourages, and profits from massive copyright infringement by its users, it is secondarily liable for the infringing acts of its users under well-established theories of contributory infringement and vicarious infringement. Moreover, Anthropic’s AI output often omits critical copyright management information regarding these works, in further violation of Publishers’ rights; in this respect, the composers of the song lyrics frequently do not get recognition for being the creators of the works that are being distributed. It is unfathomable for Anthropic to treat itself as exempt from the ethical and legal rules it purports to embrace>>

How AI training works:

<<54. Specifically, Anthropic “trains” its Claude AI models how to generate text by taking the following steps:
a.  First, Anthropic copies massive amounts of text from the internet and potentially other sources. Anthropic collects this material by “scraping” (or copying or downloading) the text directly from websites and other digital sources and onto Anthropic’s servers, using automated tools, such as bots and web crawlers, and/or by working from collections prepared by third parties, which in turn may have been harvested through web scraping. This vast collection of text forms the input, or “corpus,” upon which the Claude AI model is then trained.
b.   Second, as it deems fit, Anthropic “cleans” the copied text to remove material it perceives as inconsistent with its business model, whether technical or subjective in nature (such as deduplication or removal of offensive language), or for other  reasons.
In most instances, this “cleaning” process appears to entirely ignore copyright infringements embodied in the copied text.
c.   Third, Anthropic copies this massive corpus of previously copied text into computer memory and processes this data in multiple ways to train the Claude AI models, or establish the values of billions of parameters that form the model. That includes copying, dividing, and converting the collected text into units known as “tokens,” which are words or parts of words and punctuation, for storage. This process is referred to as “encoding” the text into tokens. For Claude, the average token is about 3.5 characters long.
d.   Fourth, Anthropic processes the data further as it “finetunes” the Claude AI model and engages in additional “reinforcement learning,” based both on human feedback and AI feedback, all of which may require additional copying of the collected text.
55.   Once this input and training process is complete, Anthropic’s Claude AI models generate output consistent in structure and style with both the text in their training corpora and the reinforcement feedback. When given a prompt, Claude will formulate a response based on its model, which is a product of its pretraining on a large corpus of text and finetuning, including based on reinforcement learning from human feedback. According to Anthropic, “Claude is not a bare language model; it has already been fine-tuned to be a helpful assistant.” Claude works with text in the form of tokens during this processing, but the output is ordinary readable text>>.
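
The “encoding” step described in § 54(c) can be pictured with a toy sketch. It is only an illustration under stated assumptions: the functions `toy_tokenize` and `encode`, the fixed four-character split and the sample sentence are all hypothetical. This is not Anthropic’s tokenizer, which the complaint does not describe beyond the average token length. The sketch simply shows text being copied into memory, cut into word pieces and punctuation, and stored as integer identifiers.

```python
import re

def toy_tokenize(text: str, max_piece: int = 4) -> list[str]:
    """Split text into words and punctuation marks, then cut long words
    into pieces of at most `max_piece` characters (a crude stand-in for
    the sub-word tokenizers real models use)."""
    tokens = []
    for chunk in re.findall(r"\w+|[^\w\s]", text):
        for i in range(0, len(chunk), max_piece):
            tokens.append(chunk[i:i + max_piece])
    return tokens

def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map each token to an integer id, growing the vocabulary as needed."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

# Hypothetical sample text, not taken from any party's works.
sample = "Training copies text, then converts it into tokens."
tokens = toy_tokenize(sample)
vocab: dict[str, int] = {}
ids = encode(tokens, vocab)

# Average token length in characters for this sample.
avg_len = sum(len(t) for t in tokens) / len(tokens)
print(tokens)
print(ids)
print(round(avg_len, 2))
```

Note that the whole input string has to sit in memory before it can be split, which is the point the complaint relies on when it calls tokenization a further act of copying; whether that amounts to “reproduction” in the legal sense is exactly what is in dispute.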

Infringements:

<<56.
First, Anthropic engages in the wholesale copying of Publishers’ copyrighted lyrics as part of the initial data ingestion process to formulate the training data used to program its AI models.
57.
Anthropic fuels its AI models with enormous collections of text harvested from the internet. But just because something may be available on the internet does not mean it is free for Anthropic to exploit to its own ends.
58.
For instance, the text corpus upon which Anthropic trained its Claude AI models and upon which these models rely to generate text includes vast amounts of Publishers’ copyrighted lyrics, for which they own or control the exclusive rights.
59.
Anthropic largely conceals the specific sources of the text it uses to train its AI models. Anthropic has stated only that “Claude models are trained on a proprietary mix of publicly available information from the Internet, datasets that we license from third party businesses, and data that our users affirmatively share or that crowd workers provide,” and that the text on which Claude 2 was trained continues through early 2023 and is 90 percent English-language. The reason that Anthropic refuses to disclose the materials it has used for training Claude is because it is aware that it is copying copyrighted materials without authorization from the copyright owners.
60.
Anthropic’s limited disclosures make clear that it has relied heavily on datasets (e.g., the “Common Crawl” dataset) that include massive amounts of content from popular lyrics websites such as genius.com, lyrics.com, and azlyrics.com, among other standard large text collections, to train its AI models.
61.
Moreover, the fact that Anthropic’s AI models respond to user prompts by generating identical or near-identical copies of Publishers’ copyrighted lyrics makes clear that Anthropic fed the models copies of those lyrics when developing the programs. Anthropic had to first copy these lyrics and process them through its AI models during training, in order for the models to subsequently disseminate copies of the lyrics as output.
62.
Second, Anthropic creates additional unauthorized reproductions of Publishers’ copyrighted lyrics when it cleans, processes, trains with, and/or finetunes the data ingested into its AI models, including when it tokenizes the data. Notably, although Anthropic “cleans” the text it ingests to remove offensive language and filter out other materials that it wishes to exclude from its training corpus, Anthropic has not indicated that it takes any steps to remove copyrighted content.
63.
By copying Publishers’ lyrics without authorization during this ingestion and training process, Anthropic violates Publishers’ copyrights in those works.
64.
Third, Anthropic’s AI models disseminate identical or near-identical copies of a wide range of Publishers’ copyrighted lyrics, in further violation of Publishers’ rights.
65.
Upon accessing Anthropic’s Claude AI models through Anthropic’s commercially available API or via its public website, users can request and obtain through Claude verbatim or near-verbatim copies of lyrics for a wide variety of songs, including copyrighted lyrics owned and controlled by
Publishers. These copies of lyrics are not only substantially but strikingly similar to the original copyrighted works>>

<<70.
Claude’s output is likewise identical or substantially and strikingly similar to Publishers’ copyrighted lyrics for each of the compositions listed in Exhibit A. These works that have been infringed by Anthropic include timeless classics as well as today’s chart-topping hits, spanning a range of musical genres. And this represents just a small fraction of Anthropic’s infringement of Publishers’ works and the works of others, through both the input and output of its AI models.
71.
Anthropic’s Claude is also capable of generating lyrics for new songs that incorporate the lyrics from existing copyrighted songs. In these cases, Claude’s output may include portions of one copyrighted work, alongside portions of other copyrighted works, in a manner that is entirely inconsistent and even inimical to how the songwriter intended them.
72.
Moreover, Anthropic’s Claude also copies and distributes Publishers’ copyrighted lyrics even in instances when it is not asked to do so. Indeed, when Claude is prompted to write a song about a given topic—without any reference to a specific song title, artist, or songwriter—Claude will often respond by generating lyrics that it claims it wrote that, in fact, copy directly from portions of Publishers’ copyrighted lyrics>>.

<<80.
In other words, Anthropic infringes Publishers’ copyrighted lyrics not only in response to specific requests for those lyrics. Rather, once Anthropic copies Publishers’ lyrics as input to train its AI models, those AI models then copy and distribute Publishers’ lyrics as output in response to a wide range of more generic queries related to songs and various other subject matter>>.

The US writers’ association’s complaint against OpenAI

The complaint for copyright infringement brought before the Southern District of New York against OpenAI by the influential Authors Guild and others (including very well-known writers) is available online (e.g. here).

The training of its AI does indeed appear to entail reproduction and therefore (absent an exception or countervailing right) infringement.

In EU law, Art. 4 of Directive (EU) 2019/790 presupposes lawful access to the work in order to invoke the commercial text and data mining exception:

<< 1. Member States shall provide for an exception or limitation to the rights provided for in Article 5(a) and Article 7(1) of Directive 96/9/EC, Article 2 of Directive 2001/29/EC, Article 4(1)(a) and (b) of Directive 2009/24/EC and Article 15(1) of this Directive for reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining.

2. Reproductions and extractions made pursuant to paragraph 1 may be retained for as long as is necessary for the purposes of text and data mining.

3. The exception or limitation provided for in paragraph 1 shall apply on condition that the use of works and other subject matter referred to in that paragraph has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online.

4. This Article shall not affect the application of Article 3 of this Directive>>.
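
Paragraph 3 refers to a reservation expressed by machine-readable means. As a rough sketch of what checking for one could look like, the snippet below reads a `robots.txt` file with Python’s standard `urllib.robotparser`. The assumptions are significant: the directive does not name `robots.txt` or any other specific mechanism, whether such a file counts as an appropriate reservation is a legal question this code cannot answer, and the crawler name `ExampleTDMBot`, the helper `may_mine` and the file contents are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is shut out, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleTDMBot
Disallow: /

User-agent: *
Allow: /
"""

def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules let `user_agent` fetch `url`.

    This reflects only what the file says; it is not a legal assessment
    of whether a valid rights reservation exists.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(may_mine(ROBOTS_TXT, "ExampleTDMBot", "https://example.com/article"))
print(may_mine(ROBOTS_TXT, "OtherBot", "https://example.com/article"))
```

A miner relying on the exception would presumably run a check of this kind before copying, and keep a record of the result, since the burden of showing that no reservation was in place may well fall on the party invoking the exception.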

The central passage of that complaint (on whether there is infringement under US law) is in §§ 51-64:

<<51. The terms “artificial intelligence” or “AI” refer generally to computer systems designed to imitate human cognitive functions.
52. The terms “generative artificial intelligence” or “generative AI” refer specifically to systems that are capable of generating “new” content in response to user inputs called “prompts.”
53. For example, the user of a generative AI system capable of generating images
from text prompts might input the prompt, “A lawyer working at her desk.” The system would then attempt to construct the prompted image. Similarly, the user of a generative AI system capable of generating text from text prompts might input the prompt, “Tell me a story about a lawyer working at her desk.” The system would then attempt to generate the prompted text.
54. Recent generative AI systems designed to recognize input text and generate
output text are built on “large language models” or “LLMs.”
55. LLMs use predictive algorithms that are designed to detect statistical patterns in the text datasets on which they are “trained” and, on the basis of these patterns, generate responses to user prompts. “Training” an LLM refers to the process by which the parameters that define an LLM’s behavior are adjusted through the LLM’s ingestion and analysis of large
“training” datasets.
56. Once “trained,” the LLM analyzes the relationships among words in an input prompt and generates a response that is an approximation of similar relationships among words in the LLM’s “training” data. In this way, LLMs can be capable of generating sentences, paragraphs, and even complete texts, from cover letters to novels.
57. “Training” an LLM requires supplying the LLM with large amounts of text for
the LLM to ingest—the more text, the better. That is, in part, the large in large language model.
58. As the U.S. Patent and Trademark Office has observed, LLM “training” “almost by definition involve[s] the reproduction of entire works or substantial portions thereof.”
59. “Training” in this context is therefore a technical-sounding euphemism for
“copying and ingesting.”
60. The quality of the LLM (that is, its capacity to generate human-seeming responses
to prompts) is dependent on the quality of the datasets used to “train” the LLM.
61. Professionally authored, edited, and published books—such as those authored by Plaintiffs here—are an especially important source of LLM “training” data.
62. As one group of AI researchers (not affiliated with Defendants) has observed, “[b]ooks are a rich source of both fine-grained information, how a character, an object or a scene looks like, as well as high-level semantics, what someone is thinking, feeling and how these states evolve through a story.”
63. In other words, books are the high-quality materials Defendants want, need, and have therefore outright pilfered to develop generative AI products that produce high-quality results: text that appears to have been written by a human writer.
64. This use is highly commercial>>
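
The idea in §§ 55-56, that a model detects statistical patterns in its training text and then generates output approximating them, can be shown in miniature with a bigram counter. This is a deliberately crude sketch and an assumption-laden one: real LLMs adjust billions of numeric parameters rather than storing word-pair counts, and the functions `train_bigrams` and `generate`, like the three-sentence corpus, are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: list[str]) -> dict[str, Counter]:
    """Count, for every word, which words follow it in the corpus."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts: dict[str, Counter], start: str, max_words: int = 4) -> str:
    """Starting from `start`, repeatedly append the most frequent follower."""
    out = [start]
    while len(out) < max_words:
        followers = counts.get(out[-1])
        if not followers:
            break  # nothing ever followed this word in the corpus
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical training corpus, echoing the complaint's own example prompt.
corpus = [
    "the lawyer works at her desk",
    "the lawyer works at night",
    "the lawyer writes a brief",
]
counts = train_bigrams(corpus)
print(generate(counts, "the"))  # the lawyer works at
```

Even in this toy, the “model” is built by reading every training sentence in full, yet what it retains is a table of counts rather than the sentences themselves. That gap between ingesting a text and storing it is where the technical argument over “reproduction” is fought.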

Plagiarism of a letter by a short story: “The Kindest” in Larson v. Dorland Perry

The District Court of Massachusetts, 14 September 2023, Case 1:19-cv-10203-IT, Larson v. Dorland Perry (reported and linked by Prof. Edward Lee on X).

The factual peculiarity here is that the allegedly plagiarizing work evolved through three versions, each further removed from the original work.

On substantial similarity: <<“Substantial similarity is an elusive concept, not subject to precise definition.” Concrete Mach. Co. v. Classic Lawn Ornaments, Inc., 843 F.2d 600, 606 (1st Cir. 1988). The inquiry is a “sliding scale”: If there are many ways to express a particular idea, then the burden of proof on the plaintiff to show substantial similarity is lighter. Id. at 606-07. Here, there are many ways to write a letter, even one dealing specifically with kidney donations. Larson Mem. SJ, Ex. 8 [Doc. No. 189-8] (examples of sample letters from organ donors/family members of organ donors to recipients); Id., Ex. 1 ¶ 7 (Larson Aff.) [Doc. No. 189-1]>>.

On the non-original parts:

<<However, “[n]o infringement claim lies if the similarity between two works rests necessarily on non-copyrightable aspects of the original—for example, ‘the underlying ideas, or expressions that are not original with the plaintiff.’” TMTV, Corp. v. Mass. Prods., Inc., 645 F.3d 464, 470 (1st Cir. 2011) (internal citation omitted). “[I]t is only when ‘the points of dissimilarity not only exceed the points of similarity, but indicate that the remaining points of similarity are (within the context of plaintiff’s work) of minimal importance either quantitatively or qualitatively, [that] no infringement results.’” Segrets, Inc., 207 F.3d at 66. “‘The test is whether the accused work is so similar to the plaintiff’s work that an ordinary reasonable person would conclude that the defendant unlawfully appropriated the plaintiff’s protectible expression by taking material of substance and value.’” Id. at 62. “While summary judgment for a plaintiff on these issues is unusual,” it may be warranted based on the factual record. Id.; accord T-Peg, Inc. v. Vt. Timber Works, Inc., 459 F.3d 97, 112 (1st Cir. 2006)>>.

On the facts supporting the finding of plagiarism in the first version:

<<The 2016 Brilliance Audio Letter. As Larson concedes, the undisputed evidence mandates a conclusion that the 2016 Letter is substantially similar to the Dorland Letter. The Dorland Letter is approximately 381 words long, Dorland Mem. SJ, Ex. C [Doc. No. 181-3]; of those 381 words, the 2016 Letter copies verbatim approximately 100, and closely paraphrases approximately 50 more, Larson Mem. SJ, Appendix I [Doc. No. 193-1]. Many of these verbatim or near-verbatim lines gave the Dorland Letter its particular character, including: “My gift…trails no strings”; “I [focused/channeled] [a majority of] my [mental] energ[y/ies] into imagining and celebrating you”; “I accept any level of involvement,…even if it is none”; “To me the suffering of strangers is just as real”; and “I [wasn’t given/didn’t have] the opportunity to form secure attachments with my family of origin.” Id. The 2016 Letter also follows an identical structure to the Dorland Letter: a paragraph introducing the donor, including information on race, age, and gender; a paragraph explaining how the donor discovered the need for kidney donation; a paragraph explaining the donor’s traumatic childhood; a paragraph expressing the donor’s focus on the future recipient; a paragraph wishing the recipient health and happiness; and a concluding paragraph expressing a desire to meet. Id. Based on the documents before the court, the 2016 Letter took “material of substance and value” from the Dorland Letter in such a quantity and in such a manner that the points of similarity outweigh the points of dissimilarity. See Segrets, Inc., 207 F.3d at 62, 66.>>
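
On the court’s figures, about 100 of 381 words (roughly 26%) were copied verbatim, and about 150 of 381 (roughly 39%) were either copied or closely paraphrased. A count of that kind can be approximated mechanically, as in the sketch below, which uses Python’s standard `difflib` to measure what share of one text’s words reappear in another as verbatim runs. This is not the method used by the court or the parties; the function `verbatim_share`, the three-word minimum run and both sample sentences are hypothetical, and a word count says nothing by itself about whether the copied material is protectable or of “substance and value”.

```python
from difflib import SequenceMatcher

def verbatim_share(original: str, later: str, min_run: int = 3) -> float:
    """Fraction of the words in `original` that reappear in `later`
    inside identical runs of at least `min_run` consecutive words."""
    a = original.lower().split()
    b = later.lower().split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    copied = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_run
    )
    return copied / len(a)

# Hypothetical texts, invented for the example.
original = "we thank you deeply for this generous gift and wish you well"
later = "they thank you deeply for this surprising gift and wish you luck"

print(verbatim_share(original, later))  # 0.75

# The proportions reported in the decision, for comparison.
print(round(100 / 381, 3))  # 0.262
print(round(150 / 381, 3))  # 0.394
```

The `min_run` threshold is a design choice: set too low, isolated common words inflate the score; set too high, short but distinctive phrases of the kind the court singled out would be missed.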

After a detailed analysis, the court nonetheless finds fair use.

The court rules out tortious interference in the repeated statements made by the allegedly plagiarized party to the contractual counterparties of the alleged plagiarist.

It also rules out defamation.

Musical infringement of Marvin Gaye by Ed Sheeran again denied because the song relied on is unprotectable

On 16 May 2023 Judge Stanton of the US District Court for the Southern District of New York, 18 Civ. 5839 (LLS), Structured Asset Sales, LLC v. Sheeran, Atlantic Recordings and others, denied that Sheeran’s Thinking Out Loud infringes Marvin Gaye’s Let’s Get It On (see the link to the text on the court’s website).

The reasoning of substantive interest is under Analysis, 2. Defendants’ Renewed Motion for Summary Judgment is Granted, p. 9 ff., and focuses on the protectability of a combination of two elements that are individually unprotectable:

<<SAS alleges that the combination of the chord progression and the harmonic rhythm used in “Thinking Out Loud” is substantially similar to that in “Let’s Get It On,” and thus infringes the work. SAS acknowledges, and the Court concurs, that the chord progression and harmonic rhythm, in isolation, are not individually protected. The question then is whether two common elements are numerous enough to make their combination eligible for copyright protection. (…) This Court is not aware of any case upholding a selection and arrangement claim based on the combination of two commonplace, unprotectable musical elements. Courts often evaluate combinations of at least three common musical elements and still find their selection and arrangement to be unoriginal.

(…) At some level, every work is the selection and arrangement of unprotectable elements. Musical compositions chiefly adhere to this template. All songs, after all, are made up of the “limited number of notes and chords available to composers.” Gaste v. Kaiserman, 863 F.2d 1061, 1068 (2d Cir. 1988). Within that limited number, there are even fewer ways to combine the elements in a manner that is pleasing to the ears. That means a songwriter only has finite options for playing a commonplace chord progression. The options are so few that many combinations have themselves become commonplace, especially in popular music. If the selection and arrangement of unprotectable elements, in their combination, is “so commonplace that it has come to be expected as a matter of course,” then it lacks the “minimal creative spark required by the Copyright Act and the Constitution” to be original and thus protectable. Feist Publications, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 363 (1991)>>.

In conclusion, the song relied on is not protectable: <<The selection and arrangement of these two musical elements in “Let’s Get It On” is now commonplace and thus their combination is unprotectable. If their combination were protected and not freely available to songwriters, the goal of copyright law “[t]o promote the Progress of Science and useful Arts” would be thwarted. U.S. Const. art. I, § 8. The Copyright Act envisioned that there will be unprotectable elements-based works “in which the selection, coordination, and arrangement are not sufficiently original to trigger copyright protection.” Feist Publications, Inc., 499 U.S. at 358.
As a matter of law, the combination of the chord progression and harmonic rhythm in “Let’s Get It On” is too commonplace to merit copyright protection>>.

A similar outcome was reached a few days earlier for the same songs in the dispute Townsend (heir of one of the two co-authors) v. Sheeran (though I could not find the text online).

Brief instructions on how not to infringe the format of a theatrical show

The Milan Court of Appeal, judgment no. 1668/2023 of 25 May 2023, RG 2392/2021, reporting judge Orsenigo, dwells on the reasons why the format relied on cannot be regarded as plagiarized.

<<8.1.1.) This ground of appeal is entirely unfounded.
Given that, as the court of first instance correctly noted, the comparison between the two shows dealing with the story of the making of the Sistine Chapel must be carried out by looking at the similarities between the expressive means employed, since this is the aspect that can confer creativity and novelty on the idea of narrating an art-historical episode (and not, therefore, the underlying idea), substantial differences between the two works emerge from an analysis of the 2010 Brochure, in which the project for the work “Il Giudizio Universale – A spectacular show” is set out (in particular, doc. 22 of the appellant’s file), and of the show “Il Giudizio Universale – into the secrets of the Sistine Chapel”, as viewable in its full version recorded on CD (doc. 11 of the respondent’s file).
First of all, the first difference lies in the presence of dialogue and acted parts: indeed, the 2010 Brochure shows no dialogue or verbal interaction, since the only performers on stage are acrobats and dancers who therefore do not act but perform choreography, whereas in the show “Il Giudizio Universale – into the secrets of the Sistine Chapel” dialogue is the key element of the work. Added to this is a clear dissimilarity between the two shows in their modes of expression and of staging: the show sketched in the 2010 Brochure is a spectacular event to be realized through acrobatics and choreography, alternating with aerial and pyrotechnic special effects (such as the wall of water, fire, fireworks, acrobats and stuntmen; see doc. 22, pp. 8, 9, 13 and 16, appellant’s first-instance file), and was meant to take place in open-air squares using “an enveloping stage set” (cf. doc. 33, p. 3, appellant’s first-instance file), whereas, by contrast, the representational core of the show “Il Giudizio Universale – into the secrets of the Sistine Chapel”, which takes place on a traditional theatre stage, consists mainly of lighting effects and static 270º projections of the Sistine Chapel, with which the actors constantly interact.
A further point of difference between the two works lies in the spectacular element: the 2010 Brochure shows that the purpose of the show is to entertain the audience, whereas the production “Il Giudizio Universale – into the secrets of the Sistine Chapel” has a primarily educational purpose, being based on the accurate reconstruction of a historical episode illustrated through images and dialogue.
In light of these considerations, which highlight substantial differences between the two shows, the assessment of the Tribunale di Milano, which found it impossible to identify any overlap in the shows’ modes of representation, is to be endorsed>>.

The problem of the lawfulness of using training data to develop artificial intelligence

The U.S. District Court for the Northern District of California, 11 May 2023, Case 4:22-cv-06823-JST, Doe 1 et al. v. GitHub et al., decides (for now) the dispute brought by owners of software uploaded to the GitHub platform (owned by Microsoft) against GitHub itself and against OpenAI for unlawful use of their software (in breach of statutes and contractual terms).

It is not hard to predict that cases of this kind will become increasingly frequent.

The facts:

<<In June 2021, GitHub and OpenAI released Copilot, an AI-based program that can “assist software coders by providing or filling in blocks of code using AI.” Id. ¶ 8. In August 2021, OpenAI released Codex, an AI-based program “which converts natural language into code and is integrated into Copilot.” Id. ¶ 9. Codex is integrated into Copilot: “GitHub Copilot uses the OpenAI Codex to suggest code and entire functions in real-time, right from your editor.” Id. ¶ 47 (quoting GitHub website). GitHub users pay $10 per month or $100 per year for access to Copilot. Id. ¶ 8.
Codex and Copilot employ machine learning, “a subset of AI in which the behavior of the program is derived from studying a corpus of material called training data.” Id. ¶ 2. Using this data, “through a complex probabilistic process, [these programs] predict what the most likely solution to a given prompt a user would input is.” Id. ¶ 79. Codex and Copilot were trained on “billions of lines” of publicly available code, including code from public GitHub repositories. Id. ¶¶ 82-83.
Despite the fact that much of the code in public GitHub repositories is subject to open-source licenses which restrict its use, id. ¶ 20, Codex and Copilot “were not programmed to treat attribution, copyright notices, and license terms as legally essential,” id. ¶ 80. Copilot reproduces licensed code used in training data as output with missing or incorrect attribution, copyright notices, and license terms. Id. ¶¶ 56, 71, 74, 87-89. This violates the open-source licenses of “tens of thousands—possibly millions—of software developers.” Id. ¶ 140. Plaintiffs additionally allege that Defendants improperly used Plaintiffs’ “sensitive personal data” by incorporating the data into Copilot and therefore selling and exposing it to third parties. Id. ¶¶ 225-39>>.

Many violations are alleged, which is what makes the case interesting. Some claims, however, are dismissed for now because the allegations are insufficiently specific, though with leave to amend.

The case goes on: we shall see.

(news and link to the decision from Kieran McCarthy on Eric Goldman’s blog)

Bananas duct-taped to a wall: no copyright infringement in the Morford/Cattelan case

The U.S. District Court for the Southern District of Florida, Judge Scola, 12 June 2023, Case 1:21-cv-20039-RNS, Morford v. Cattelan, decides the dispute between the two artists Morford and Cattelan in an interesting ruling.

See the two works side by side in the decision: at first glance they look very similar.

The court, however, prunes the comparison by applying the well-known and important “abstraction-filtration-comparison” test (p. 9), after first stating that no proof was given of Cattelan’s access to the work asserted.

Outcome of the filtration:

<<Where does this leave the Court’s filtration analysis? Effectively, it
removes from consideration the largest and most obvious abstracted element of
Banana and Orange: the “banana [that] appears to be fixed to the panel with a
piece of silver duct tape running vertically at a slight angle, left to right.” (Order
Denying Mot. Dismiss at 10.) This expression is not protectible under the
merger doctrine. But that is not to say that Morford’s work is wholly
unprotectible under the doctrine, and this is where the Court diverges from
Cattelan’s position. There are still protectible elements of Morford’s work: (1)
the green rectangular panel on which the fruit is placed; (2) the use of masking
tape to border the panels; (3) the orange on the top panel and banana on the
bottom panel, both of which are centered; (4) the banana’s placement “at a
slight angle, with the banana stalk on the left side pointing up.” (Id.)>>

But then what Cattelan took over amounts to very little.

See at p. 14 the side-by-side comparison, which is very clear and which our own judges ought to practise as well.

In short, only this remains:

<<Reviewing these elements as a whole, it is clear that Banana and Orange
and Comedian share only one common feature that the Court has not already
set aside as unprotectible: both bananas are situated with the banana’s stalk
on the left-hand side of sculpture. This solitary common feature is, on its own,
insignificant and insufficient to support a finding of legal copying. See Altai,
982 F.2d at 710. And the placement of the banana’s stalk (on the right-hand
side of the sculpture versus the left, or vice-versa) would be another element
subject to the merger doctrine anyway: there are only two ways the stalk may
be placed, to the right or to the left. BUC Int’l, 489 F.3d at 1143>>.

 

(news and link to the decision from Eleonora Rosati, IPKat)

Andy Warhol and his reworking of the photograph of Prince taken by Lynn Goldsmith: the Supreme Court holds there is no fair use

The U.S. Supreme Court, No. 21-869, 18 May 2023, ANDY WARHOL FOUNDATION FOR THE VISUAL ARTS, INC. v. GOLDSMITH ET AL., decides the matter in the title.

It decides one of the most important issues in copyright law, which very often concerns works that rework earlier works.

Here I reproduce the syllabus in full: in essence, the Supreme Court’s analysis focuses only on the first of the four factors to be weighed in deciding fair use (“In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include: (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;”), 17 U.S.C. § 107.

<< The “purpose and character” of AWF’s use of Goldsmith’s photograph in commercially licensing Orange Prince to Condé Nast does not favor AWF’s fair use defense to copyright infringement. Pp. 12–38.
(a) AWF contends that the Prince Series works are “transformative,” and that the first fair use factor thus weighs in AWF’s favor, because the works convey a different meaning or message than the photograph. But the first fair use factor instead focuses on whether an allegedly infringing use has a further purpose or different character, which is a matter of degree, and the degree of difference must be weighed against other considerations, like commercialism. Although new expression, meaning, or message may be relevant to whether a copying use has a sufficiently distinct purpose or character, it is not, without more, dispositive of the first factor. Here, the specific use of Goldsmith’s photograph alleged to infringe her copyright is AWF’s licensing of Orange Prince to Condé Nast. As portraits of Prince used to depict Prince in magazine stories about Prince, the original photograph and AWF’s copying use of it share substantially the same purpose. Moreover, AWF’s use is of a commercial nature. Even though Orange Prince adds new expression to Goldsmith’s photograph, in the context of the challenged use, the first fair use factor still favors Goldsmith. Pp. 12–27.
(1) The Copyright Act encourages creativity by granting to the creator of an original work a bundle of rights that includes the rights to reproduce the copyrighted work and to prepare derivative works. 17 U. S. C. §106. Copyright, however, balances the benefits of incentives to create against the costs of restrictions on copying. This balancing act is reflected in the common-law doctrine of fair use, codified in §107, which provides: “[T]he fair use of a copyrighted work, . . . for purposes such as criticism, comment, news reporting, teaching . . . , scholarship, or research, is not an infringement of copyright.” To determine whether a particular use is “fair,” the statute enumerates four factors to be considered. The factors “set forth general principles, the application of which requires judicial balancing, depending upon relevant circumstances.” Google LLC v. Oracle America, Inc., 593 U. S. ___, ___.
The first fair use factor, “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes,” §107(1), considers the reasons for, and nature of, the copier’s use of an original work. The central question it asks is whether the use “merely supersedes the objects of the original creation . . . (supplanting the original), or instead adds something new, with a further purpose or different character.” Campbell v. Acuff-Rose Music, Inc., 510 U. S. 569, 579 (internal quotation marks and citations omitted). As most copying has some further purpose and many secondary works add something new, the first factor asks “whether and to what extent” the use at issue has a purpose or character different from the original. Ibid. (emphasis added). The larger the difference, the more likely the first factor weighs in favor of fair use. A use that has a further purpose or different character is said to be “transformative,” but that too is a matter of degree. Ibid. To preserve the copyright owner’s right to prepare derivative works, defined in §101 of the Copyright Act to include “any other form in which a work may be recast, transformed, or adapted,” the degree of transformation required to make “transformative” use of an original work must go beyond that required to qualify as a derivative.
The Court’s decision in Campbell is instructive. In holding that parody may be fair use, the Court explained that “parody has an obvious claim to transformative value” because “it can provide social benefit, by shedding light on an earlier work, and, in the process, creating a new one.” 510 U. S., at 579. The use at issue was 2 Live Crew’s copying of Roy Orbison’s song, “Oh, Pretty Woman,” to create a rap derivative, “Pretty Woman.” 2 Live Crew transformed Orbison’s song by adding new lyrics and musical elements, such that “Pretty Woman” had a different message and aesthetic than “Oh, Pretty Woman.” But that did not end the Court’s analysis of the first fair use factor. The Court found it necessary to determine whether 2 Live Crew’s transformation rose to the level of parody, a distinct purpose of commenting on the original or criticizing it. Further distinguishing between parody and satire, the Court explained that “[p]arody needs to mimic an original to make its point, and so has some claim to use the creation of its victim’s (or collective victims’) imagination, whereas satire can stand on its own two feet and so requires justification for the very act of borrowing.” Id., at 580–581. More generally, when “commentary has no critical bearing on the substance or style of the original composition, . . . the claim to fairness in borrowing from another’s work diminishes accordingly (if it does not vanish), and other factors, like the extent of its commerciality, loom larger.” Id., at 580.
Campbell illustrates two important points. First, the fact that a use is commercial as opposed to nonprofit is an additional element of the first fair use factor. The commercial nature of a use is relevant, but not dispositive. It is to be weighed against the degree to which the use has a further purpose or different character. Second, the first factor relates to the justification for the use. In a broad sense, a use that has a distinct purpose is justified because it furthers the goal of copyright, namely, to promote the progress of science and the arts, without diminishing the incentive to create. In a narrower sense, a use may be justified because copying is reasonably necessary to achieve the user’s new purpose. Parody, for example, “needs to mimic an original to make its point.” Id., at 580–581. Similarly, other commentary or criticism that targets an original work may have compelling reason to “conjure up” the original by borrowing from it. Id., at 588. An independent justification like this is particularly relevant to assessing fair use where an original work and copying use share the same or highly similar purposes, or where wide dissemination of a secondary work would otherwise run the risk of substitution for the original or licensed derivatives of it. See, e.g., Google, 593 U. S., at ___ (slip op., at 26).
In sum, if an original work and secondary use share the same or highly similar purposes, and the secondary use is commercial, the first fair use factor is likely to weigh against fair use, absent some other justification for copying. Pp. 13–20.
(2) The fair use provision, and the first factor in particular, requires an analysis of the specific “use” of a copyrighted work that is alleged to be “an infringement.” §107. The same copying may be fair when used for one purpose but not another. See Campbell, 510 U. S., at 585. Here, Goldsmith’s copyrighted photograph has been used in multiple ways. The Court limits its analysis to the specific use alleged to be infringing in this case—AWF’s commercial licensing of Orange Prince to Condé Nast—and expresses no opinion as to the creation, display, or sale of the original Prince Series works. In the context of Condé Nast’s special edition magazine commemorating Prince, the purpose of the Orange Prince image is substantially the same as that of Goldsmith’s original photograph. Both are portraits of Prince used in magazines to illustrate stories about Prince. The use also is of a commercial nature. Taken together, these two elements counsel against fair use here. Although a use’s transformativeness may outweigh its commercial character, in this case both point in the same direction. That does not mean that all of Warhol’s derivative works, nor all uses of them, give rise to the same fair use analysis. Pp. 20–27.
(b) AWF contends that the purpose and character of its use of Goldsmith’s photograph weighs in favor of fair use because Warhol’s silkscreen image of the photograph has a different meaning or message. By adding new expression to the photograph, AWF says, Warhol made transformative use of it. Campbell did describe a transformative use as one that “alter[s] the first [work] with new expression, meaning, or message.” 510 U. S., at 579. But Campbell cannot be read to mean that §107(1) weighs in favor of any use that adds new expression, meaning, or message. Otherwise, “transformative use” would swallow the copyright owner’s exclusive right to prepare derivative works, as many derivative works that “recast, transfor[m] or adap[t]” the original, §101, add new expression of some kind. The meaning of a secondary work, as reasonably can be perceived, should be considered to the extent necessary to determine whether the purpose of the use is distinct from the original. For example, the Court in Campbell considered the messages of 2 Live Crew’s song to determine whether the song had a parodic purpose. But fair use is an objective inquiry into what a user does with an original work, not an inquiry into the subjective intent of the user, or into the meaning or impression that an art critic or judge draws from a work.
Even granting the District Court’s conclusion that Orange Prince reasonably can be perceived to portray Prince as iconic, whereas Goldsmith’s portrayal is photorealistic, that difference must be evaluated in the context of the specific use at issue. The purpose of AWF’s recent commercial licensing of Orange Prince was to illustrate a magazine about Prince with a portrait of Prince. Although the purpose could be more specifically described as illustrating a magazine about Prince with a portrait of Prince, one that portrays Prince somewhat differently from Goldsmith’s photograph (yet has no critical bearing on her photograph), that degree of difference is not enough for the first factor to favor AWF, given the specific context and commercial nature of the use. To hold otherwise might authorize a range of commercial copying of photographs to be used for purposes that are substantially the same as those of the originals.
AWF asserts another related purpose of Orange Prince, which is to comment on the “dehumanizing nature” and “effects” of celebrity. No doubt, many of Warhol’s works, and particularly his uses of repeated images, can be perceived as depicting celebrities as commodities. But even if such commentary is perceptible on the cover of Condé Nast’s tribute to “Prince Rogers Nelson, 1958–2016,” on the occasion of the man’s death, the asserted commentary is at Campbell’s lowest ebb: It “has no critical bearing on” Goldsmith’s photograph, thus the commentary’s “claim to fairness in borrowing from” her work “diminishes accordingly (if it does not vanish).” Campbell, 510 U. S., at 580. The commercial nature of the use, on the other hand, “loom[s] larger.” Ibid. Like satire that does not target an original work, AWF’s asserted commentary “can stand on its own two feet and so requires justification for the very act of borrowing.” Id., at 581. Moreover, because AWF’s copying of Goldsmith’s photograph was for a commercial use so similar to the photograph’s typical use, a particularly compelling justification is needed. Copying the photograph because doing so was merely helpful to convey a new meaning or message is not justification enough. Pp. 28–37.
(c) Goldsmith’s original works, like those of other photographers, are entitled to copyright protection, even against famous artists. Such protection includes the right to prepare derivative works that transform the original. The use of a copyrighted work may nevertheless be fair if, among other things, the use has a purpose and character that is sufficiently distinct from the original. In this case, however, Goldsmith’s photograph of Prince, and AWF’s copying use of the photograph in an image licensed to a special edition magazine devoted to Prince, share substantially the same commercial purpose. AWF has offered no other persuasive justification for its unauthorized use of the photograph. While the Court has cautioned that the four statutory fair use factors may not “be treated in isolation, one from another,” but instead all must be “weighed together, in light of the purposes of copyright,” Campbell, 510 U. S., at 578, here AWF challenges only the Court of Appeals’ determinations on the first fair use factor, and the Court agrees the first factor favors Goldsmith. P. 38 >>

However high Warhol’s creativity, it cannot be denied that he leaned on that of the photographer.

Under our law, exploiting a derivative work, however creative it may be, always requires the consent of the owner of the underlying work (unless the link between the two is evanescent …).

Decided by majority, with a dissenting opinion by Kagan, joined by Roberts. The dissent is highly articulate, based mainly on finding a transformative use and on downplaying the importance of Warhol’s commercial exploitation. I reproduce only this passage:

<<Now recall all the ways Warhol, in making a Prince portrait from the Goldsmith photo, “add[ed] something new, with a further purpose or different character”—all the ways he “alter[ed] the [original work’s] expression, meaning, [and] message.” Ibid. The differences in form and appearance, relating to “composition, presentation, color palette, and media.” 1 App. 227; see supra, at 7–10. The differences in meaning that arose from replacing a realistic—and indeed humanistic—depiction of the performer with an unnatural, disembodied, masklike one. See ibid. The conveyance of new messages about celebrity culture and its personal and societal impacts. See ibid. The presence of, in a word, “transformation”—the kind of creative building that copyright exists to encourage. Warhol’s use, to be sure, had a commercial aspect. Like most artists, Warhol did not want to hide his works in a garret; he wanted to sell them. But as Campbell and Google both demonstrate (and as further discussed below), that fact is nothing near the showstopper the majority claims. Remember, the more transformative the work, the less commercialism matters. See Campbell, 510 U. S., at 579; supra, at 14; ante, at 18 (acknowledging the point, even while refusing to give it any meaning). The dazzling creativity evident in the Prince portrait might not get Warhol all the way home in the fair-use inquiry; there remain other factors to be considered and possibly weighed against the first one. See supra, at 2, 10, 14. But the “purpose and character of [Warhol’s] use” of the copyrighted work—what he did to the Goldsmith photo, in service of what objects—counts powerfully in his favor. He started with an old photo, but he created a new new thing>>.