• No, Microsoft isn’t stealing your data to feed Copilot


    • This topic has 9 replies, 6 voices, and was last updated 5 months ago.
    #2727867

    MICROSOFT 365
    By Peter Deegan

    Social media “experts” are touting a false “fix” to stop Microsoft from using your Word, Excel, or PowerPoint files to train Copilot …
    [See the full post at: No, Microsoft isn’t stealing your data to feed Copilot]

    6 users thanked author for this post.
    • #2727877

      This is an interesting article. I was playing around with a very small awk program (maybe a dozen lines) that I had written to test some things, and I was having a problem with it. So I thought, why not try Copilot? I gave it the general particulars of the code, and I could not believe what it gave back: almost line for line the code I had written, including a comment I had inserted. Where did it get this? The good news is that it corrected the issue I had.

      A few days later I tried some PowerShell code, looking for ways of doing a procedure, and it gave me my awk code back even though I had specified PowerShell. It apparently somehow remembered that from before; I don’t know how. Copilot is still a work in progress.

      JohnD

      • #2727881

        The code similarity is most likely “great minds think alike” plus consistent code styles.

        Any AI will get things wrong; for example, the differences between Word, Excel, and PowerPoint code. https://office-watch.com/2023/more-office-vba-chatgpt/
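
        To show the kind of mix-up I mean, here’s a purely illustrative sketch (my own, not taken from the linked article) of the same trivial task, putting the word “Hello” into a document, in each app’s flavour of VBA. Even “Range” means different things: worksheet cells in Excel, but character positions in Word.

          ' Illustrative VBA only - assumes a workbook, document or presentation is open.
          ' Excel: Range addresses worksheet cells.
          Sub ExcelHello()
              ActiveSheet.Range("A1").Value = "Hello"
          End Sub

          ' Word: Range is a span of character positions, not cells.
          Sub WordHello()
              ActiveDocument.Range(Start:=0, End:=0).InsertBefore "Hello"
          End Sub

          ' PowerPoint: text lives in a shape's TextFrame on a slide
          ' (assumes Shapes(1) is a text-capable shape).
          Sub PowerPointHello()
              ActivePresentation.Slides(1).Shapes(1).TextFrame.TextRange.Text = "Hello"
          End Sub

        An AI that blurs those object models will cheerfully hand back Excel-style Range("A1") calls inside a Word macro, which simply won’t work.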

        So I’m not surprised that it gets languages confused too.

        As with most things, using AI to write code is a big help; it gets maybe 80-90% of the basic stuff done. It’s that last 20% or so that needs human intervention.

        Peter Deegan
        Office-Watch.com

    • #2728034

      The headline of the article is “No, Microsoft isn’t stealing your data to feed Copilot”, but as the article itself notes, Microsoft goes out of its way not to mention perpetual-licence Office documents, web Office documents, 365 Basic Office documents, or cloud data in general as being excluded from Copilot training. So I don’t see how the claim in the headline can be made.

      I would honestly be shocked if Microsoft wasn’t training Copilot on those things, and more besides; that’s the only logical reason Microsoft would have to specify certain things but not others. After all, if Microsoft wasn’t training Copilot on any of its users’ data, it would simply say “we do not train Copilot on your data” rather than give a very specific list of things it doesn’t train Copilot on.

      At the end of the day, Microsoft simply has too much to gain by training Copilot on its users’ data. Sure, it will exclude corporate clients from this, because for now it needs to keep them on side, but the advantage in the AI race is in large part based on who has access to the most and best training data, and Microsoft isn’t going to freely give up such an advantage just for PR.

      2 users thanked author for this post.
    • #2728048

      I’m afraid there’s been some misunderstanding about both the intent and content of the article.

      Firstly, there was absolutely NO intention to deceive in the headline. The article was a rebuttal of a meme that’s been going around social media and wasting a lot of time as folks turn off “Connected Services”, only to find their Office apps not working fully afterwards.

      Microsoft goes out of its way to not mention perpetual-licence Office documents, web Office documents and 365 Basic Office documents, as well as all cloud data

      The items listed under the heading “Who is not mentioned in Microsoft’s promise” are NOT a list of customer data that is definitely used to teach Copilot. It’s my short list of customers who do not appear to be included in Microsoft’s promise; that doesn’t mean their data is being ‘farmed’.

      For example, it’s possible that perpetual licence Office users aren’t mentioned just because there’s no direct Copilot integration in those products.

      As to ‘cloud data’: for customers covered by the MS exclusion list, that would include OneDrive/SharePoint. I merely noted that Microsoft should explicitly say so to allay suspicions.

      At the end of the day, Microsoft simply has too much to gain by training Copilot on its users’ data. Sure,

      As I mentioned, using private customer data (commercial or personal) to train AI is risky from both a legal and a commercial/PR viewpoint. OpenAI, among others, is fighting lawsuits over the use of copyrighted material, so all the AI players should be extra careful about what data they use for training.

      AI race is in large part based on who has access to the most and best training data,

      My understanding is that the technology has developed to the point where the quantity of data in an LLM isn’t as important as quality: how well that data is organized and how good the responses are (speed, accuracy, and output in differing formats). All the AI players are focused on those ‘quality’ issues more than on shovelling in more data.

      For many years I’ve kept any confidential data either on local storage, in encrypted files or both.  I was recommending that long before Microsoft ‘discovered’ AI/Copilot.

      May I again mention Escape from the clutches of OneDrive (AskWoody, July 8, 2024), which explains how to have “The Best of Both Worlds” (TNG 3.26 & 4.01 <g>). In other words, a prudent mix of local storage plus cloud storage, the latter for files you need to collaborate on or share between your own devices.

      Peter Deegan
      Office-Watch.com


      2 users thanked author for this post.
      • #2728099

        Please don’t get me wrong: I wasn’t for one moment implying that you were trying to deceive anyone. I just disagree with the premise of the headline, and with the premise that Microsoft wouldn’t train Copilot on its users’ data, that’s all.

        I think the main point of disagreement comes down to whether one believes that Microsoft would be willing to risk a lawsuit to push ahead with training on its users’ data, and I very much think it would. The modus operandi of technology companies over the past decade or two has been to push the legal limits (to put it kindly), knowing that they will face nothing more than a small (for them) monetary fine if they get caught.

        As for the PR standpoint, Microsoft has shown time and again in recent years that it is willing to endure poor PR to push its competitive advantage and profits, presumably because it knows it holds near-monopolies in many spaces, so poor PR will lose it few customers. I can’t see a reason to believe that this has changed.

        I know that you weren’t claiming anything by mentioning them, but because of these two things and because of the potential profits at stake, when I see that Microsoft has specifically omitted certain things from a statement, I can’t help but be suspicious of them.

        But anyway, regardless of the disagreement, I enjoyed the article, as I always do, and found it very informative. Thank you for your work. I hope that you, rather than I, are right on this matter!

        2 users thanked author for this post.
        • #2728105

          Thanks for that considerate reply and kind words.

          My POV is that Microsoft CAN always be trusted – trusted to act in its own best interests <g>.

          In this case I think/hope that means not scraping customers’ documents too, shall we say, enthusiastically.

          Peter Deegan
          Office-Watch.com

          2 users thanked author for this post.
    • #2728160

      First and foremost, if you can’t write an “official-sounding” letter without the aid of AI, GO BACK TO SCHOOL, starting with 4th grade!!!!!

      I’m not worried about Copilot stealing my data, as it is not on a single one of my PCs. As for Word, Excel, and PowerPoint, I use Office 2021. It didn’t come with Copilot, and since it never expires (apart from updates), I can use it forever, just like the Office 2010 I have on three other machines.

    • #2728223

      Nice editorial Peter,

      After reading this you can understand why people still don’t have any trust. Microsoft basically just likes treating people like mushrooms: feeding them confusing bull**** and keeping them in the dark.

      1 user thanked author for this post.
    • #2728338

      Thanks very much for the research Peter. I know you’re a thoughtful, well-intentioned researcher. I don’t trust Microsoft on this topic. And it’s unlikely I ever would.

      Human, who sports only naturally-occurring DNA ~ oneironaut ~ broadcaster

      2 users thanked author for this post.